Method of Purifying RNA Binding Protein-RNA Complexes

ABSTRACT

The present invention provides methods for purifying RNA molecules interacting with an RNA binding protein (RBP), and the use of such methods to analyze a gene expression profile of a cell. The invention also provides sequences of RNA molecules that mediate binding to an RBP, proteins encoded by the sequences, a method of identifying the sequences, and the use of the sequences in a screen to identify bioactive molecules. The invention also provides RNA motifs found among the sequences and compounds that bind the RNA motifs. In addition, the invention provides methods of treating diseases associated with a function of an RNA binding protein.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority from U.S. Provisional Patent Application 60/513,183, filed Oct. 23, 2003, which is incorporated herein by reference.

FIELD OF THE INVENTION

The present invention provides methods for purifying RNA molecules interacting with an RNA binding protein (RBP), and the use of such methods to analyze a gene expression profile of a cell. The invention also provides sequences of RNA molecules that mediate binding to an RBP, proteins encoded by the sequences, a method of identifying the sequences, and the use of the sequences in a screen to identify bioactive molecules. The invention also provides RNA motifs found among the sequences and compounds that bind the RNA motifs. In addition, the invention provides methods of treating diseases associated with a function of an RNA binding protein.

BACKGROUND OF THE INVENTION

RNA binding proteins (RBPs) are frequently targets of human autoimmune or genetic neurologic diseases. Notable examples among autoimmune disease include systemic lupus erythematosus, primary biliary cirrhosis (PBC) and Sjogren's syndrome, and among neurologic disease include the paraneoplastic neurologic antigens Nova and Hu, and the Fragile X mental retardation FMR1 protein, the spinal muscular atrophy SMN protein, the myotonic dystrophy CELF proteins, and the spinocerebellar ataxia SCA1 protein. Understanding the role these proteins play in disease, normal biology, and in the brain requires methods to identify the set of RNAs they bind to in vivo, and the use of mouse models of these disorders for RNA target validation. The targets of RBPs involved in a number of autoimmune and genetic diseases have been difficult to identify, however. Accordingly, the present invention provides methods for purifying RNA molecules interacting with an RNA binding protein.

SUMMARY OF THE INVENTION

In one embodiment, the present invention provides a method for purifying an RNA molecule interacting with an RBP of interest in a biological sample, comprising the steps of: (a) contacting the biological sample with an agent that creates a covalent bond between the RNA molecule and the RBP of interest, thereby generating a covalently bound RBP-RNA complex containing the RNA molecule; (b) cleaving the RNA molecule by contacting the RBP-RNA complex with an agent capable of cleaving a bond thereof, thereby generating a fragment of the RNA molecule, wherein the fragment is at least 22 nucleotide bases in length; (c) selecting the RBP-RNA complex with a molecule that specifically interacts with a component of the RBP-RNA complex; and (d) purifying the RBP-RNA complex under stringent conditions, thereby purifying an RNA molecule interacting with an RBP of interest.

In another embodiment, the present invention provides a method for purifying an RNA molecule interacting with an RBP of interest in a biological sample, comprising the steps of (a) contacting the biological sample with an agent that creates a covalent bond between the RNA molecule and the RBP of interest, thereby generating a covalently bound RBP-RNA complex containing the RNA molecule; (b) cleaving the RNA molecule with an agent capable of cleaving a bond thereof, thereby generating a fragment of the RNA molecule, wherein the fragment is at least 22 nucleotide bases in length; (c) selecting the RBP-RNA complex with a molecule that specifically interacts with a component of the RBP-RNA complex; and (d) purifying the RBP-RNA complex, wherein the purifying step comprises an agent that disrupts an intermolecular interaction, thereby purifying an RNA molecule interacting with an RBP of interest.

In another embodiment, the present invention provides a method for purifying an RNA molecule interacting with an RBP of interest in a biological sample, comprising the steps of (a) contacting the biological sample with an agent that creates a covalent bond between the RNA molecule and the RBP of interest, thereby generating a covalently bound RBP-RNA complex containing the RNA molecule; (b) cleaving the RNA molecule with an agent capable of cleaving a bond thereof, thereby generating a fragment of the RNA molecule, wherein the fragment is at least 22 nucleotide bases in length; (c) selecting the RBP-RNA complex with a molecule that specifically interacts with a component of the RBP-RNA complex; and (d) purifying the RBP-RNA complex, wherein the purifying comprises a chromatographic method, thereby purifying an RNA molecule interacting with an RBP of interest.

In another embodiment, the present invention provides a method for identifying a plurality of RNA molecules interacting with a known RBP in a biological sample, comprising the following steps: (a) contacting the biological sample with an agent that results in a plurality of covalently bound RBP-RNA complexes in the biological sample; (b) obtaining RNA fragments of at least 22 bases in length from the biological sample; (c) selecting a plurality of RBP-RNA complexes of interest with a molecule that specifically interacts with the known RBP; (d) purifying the plurality of RBP-RNA complexes of interest under stringent conditions; and (e) identifying a plurality of RNA molecules in the RBP-RNA complexes of interest; thereby identifying a plurality of RNA molecules interacting with a known RBP in a biological sample.

In another embodiment, the present invention provides a method of screening a test compound for its ability to modulate expression of a gene in a cell, comprising the steps of: (a) purifying a first plurality of RNA binding protein-RNA complexes from the cell by the CLIP method, wherein the cell has been contacted with the test compound; (b) identifying a first plurality of RNA molecules in the first plurality of RBP-RNA complexes; (c) assessing an amount of the gene among the first plurality of RNA molecules; (d) purifying a second plurality of RNA binding protein-RNA complexes from the cell by the CLIP method, wherein the cell has not been contacted with the test compound; (e) identifying a second plurality of RNA molecules in the second plurality of RBP-RNA complexes; and (f) assessing an amount of the gene among the second plurality of RNA molecules; wherein a difference between the amount of the gene in the first plurality of RNA molecules and the amount of the gene in the second plurality of RNA molecules indicates an ability of the test compound to modulate expression of a gene in a cell.
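By way of illustration only, the comparison in steps (c) and (f) can be sketched as follows. The sketch assumes, hypothetically, that each identified RNA molecule has already been assigned a gene name, that the "amount" of a gene is its share of all identified tags, and that a two-fold change counts as a difference. None of these choices is specified in the present description, and the function names and threshold are illustrative.

```python
from collections import Counter
from typing import Iterable


def gene_fraction(tag_genes: Iterable[str], gene: str) -> float:
    """Share of identified RNA tags that were assigned to `gene`."""
    counts = Counter(tag_genes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tags supplied")
    return counts[gene] / total


def amounts_differ(treated: Iterable[str], untreated: Iterable[str],
                   gene: str, min_ratio: float = 2.0) -> bool:
    """Return True when the gene's share of tags changes by at least
    `min_ratio` (a hypothetical threshold) between the two samples."""
    a = gene_fraction(treated, gene)
    b = gene_fraction(untreated, gene)
    if a == 0 and b == 0:
        return False          # gene absent from both samples
    if a == 0 or b == 0:
        return True           # gene present in only one sample
    ratio = a / b
    return ratio >= min_ratio or ratio <= 1 / min_ratio


if __name__ == "__main__":
    # Made-up gene labels, for illustration only.
    with_compound = ["geneA"] * 6 + ["geneB"] * 4
    without_compound = ["geneA"] * 2 + ["geneB"] * 8
    print(amounts_differ(with_compound, without_compound, "geneA"))
```

Normalizing to the total number of tags is one simple way to make two samples of different sizes comparable; other normalizations or a statistical test could equally be used.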

In another embodiment, there are provided nucleotide linkers comprising a sequence as set forth in SEQ ID No 477-502.

In another embodiment, there is provided a compound which interacts with a motif of an isolated nucleic acid sequence as set forth in SEQ ID No 1-335.

In another embodiment, there is provided a compound which interacts with a motif of an isolated nucleic acid sequence as set forth in SEQ ID No 336-449.

In another embodiment, there is provided a compound which interacts with a motif of an isolated nucleic acid sequence as set forth in SEQ ID No 1-78.

In another embodiment, there is provided a method of generating a gene expression profile of a cell, tissue, or biological sample in vivo, comprising: purifying an RBP-RNA complex from the cell according to the method for purifying RNA interacting with RBPs of interest, wherein RNA bound to the RBP-RNA complex comprises a subset of the mRNA of the cell; and then identifying the mRNAs of the subset, thereby generating the gene expression profile of the cell, tissue, or biological sample.

In another embodiment, the present invention provides a method of treating a disease or disorder in a subject, wherein the disease or disorder is associated with a function of an RNA binding protein, comprising contacting a cell in the subject with an agent that modulates an expression or activity of a gene, or a protein encoded by the gene, wherein a transcript of the gene comprises a nucleic acid sequence set forth in SEQ ID No 1-335, thereby treating a disease or disorder in a subject.

In another embodiment, the present invention provides a method of treating a disease or disorder in a subject, wherein the disease or disorder is associated with a function of an RNA binding protein, comprising contacting a cell in the subject with an agent that modulates an expression or activity of a gene, or a protein encoded by the gene, wherein a transcript of the gene comprises a nucleic acid sequence set forth in SEQ ID No 336-449, thereby treating a disease or disorder in a subject.

In another embodiment, the present invention provides a method of diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject, wherein a transcript of the gene comprises a nucleic acid sequence set forth in SEQ ID No 1-335, comprising assessing a splicing pattern of the transcript in a biological sample from the subject; assessing a splicing pattern of a reference standard; and comparing the splicing pattern of the transcript to the splicing pattern of a reference standard, thereby diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject.

In another embodiment, the present invention provides a method of diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject, wherein a transcript of the gene comprises a nucleic acid sequence set forth in SEQ ID No 336-449, comprising assessing a splicing pattern of the transcript in a biological sample from the subject; assessing a splicing pattern of a reference standard; and comparing the splicing pattern of the transcript to the splicing pattern of a reference standard, thereby diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject.

In another embodiment, the present invention provides a method of assessing a level of association of an RNA transcript of interest with an RBP of interest, comprising the steps of: (a) contacting an RBP-RNA complex containing said RBP of interest with an agent that creates a covalent bond between two components of said RBP-RNA complex; (b) cleaving an RNA molecule of said RBP-RNA complex with an agent capable of cleaving a bond of said RNA molecule, thereby generating a fragment of said RNA molecule, wherein said fragment is at least 22 nucleotide bases in length; (c) selecting said RBP-RNA complex with a molecule that specifically interacts with a component thereof; (d) purifying said RBP-RNA complex, wherein said purifying comprises a chromatographic method; and (e) assessing a presence or amount of said RNA transcript of interest or a fragment thereof in said plurality of RNA molecules, thereby assessing a level of association of an RNA transcript of interest with an RBP of interest.

In another embodiment, the present invention provides a method of screening a test compound for its ability to modulate a level of association between an RBP and an RNA transcript, comprising the steps of: (a) assessing a first level of association between said RBP and said RNA transcript in a first cell by the method of claim 112, wherein said first cell has been contacted with said test compound; (b) assessing a second level of association between said RBP and said RNA transcript in a second cell by the method of claim 112, wherein said second cell has not been contacted with said test compound; and (c) comparing said first level of association with said second level of association, wherein a difference between said first level of association and said second level of association indicates an ability of said test compound to modulate a level of association between said RBP and said RNA transcript.

In another embodiment, there is provided an RBP binding site comprised of a nucleic acid comprising a sequence as set forth in SEQ ID No 1-335 and 450-469.

In another embodiment, there is provided an RBP binding site comprised of a nucleic acid comprising a sequence as set forth in SEQ ID No 336-449 and 503-508.

In another embodiment, there is provided a method of modifying an expression profile of a gene of interest comprising engineering the gene of interest to comprise an RBP binding site comprising a nucleic acid sequence as set forth in SEQ ID No 1-449, 450-469, and 503-508, thereby modifying the expression profile of the gene of interest.

In another embodiment, there is provided the use of a gene that has been engineered to comprise an RBP binding site comprising a nucleic acid sequence as set forth in SEQ ID NO 1-449, 450-469, and 503-508, in order to compete for biological factors that bind the sites of a gene of interest, thus modifying the splicing pattern of the gene of interest.

In another embodiment, there is provided an isolated nucleic acid that comprises a sequence set forth in SEQ ID NO 63, 64, 76, 77, 78, 84, or 292-335.

In another embodiment, there is provided an isolated nucleic acid that comprises a sequence set forth in SEQ ID No 374, 377, 378, 380, 382, 384, 387, 394-396, 415, 416, or 421.

In another embodiment, there is provided an oligonucleotide of at least 15 bases, with a nucleic acid sequence corresponding to SEQ ID NO 1-335, or 336-449, or a complementary sequence thereof.

In another embodiment, there is provided an isolated peptide encoded by a nucleic acid sequence as set forth in SEQ ID No 63, 64, 76, 77, 78, 84, or 292-335.

In another embodiment, there is provided an isolated peptide encoded by a nucleic acid sequence as set forth in SEQ ID No 374, 377, 378, 380, 382, 384, 387, 394-396, 415, 416, or 421.

In another embodiment, there is provided a transgenic mouse comprising a mutation in a Nova-2 gene.

In another embodiment, the present invention provides a method for purifying an RBP present in an RBP-RNA complex containing a known component, comprising the steps of (a) contacting said RBP-RNA complex with an agent that creates a covalent bond between two components of said RBP-RNA complex; (b) cleaving an RNA molecule of said RBP-RNA complex with an agent capable of cleaving a bond of said RNA molecule, thereby generating a fragment of said RNA molecule, wherein said fragment is at least 22 nucleotide bases in length; (c) selecting said RBP-RNA complex with a molecule that specifically interacts with said known component; (d) purifying said RBP-RNA complex under stringent conditions; and (e) removing said RBP from said RBP-RNA complex, thereby purifying an RBP present in an RBP-RNA complex containing a known component.

A method for identifying an unknown RBP present in an RBP-RNA complex containing a known component, comprising the steps of:

-   a. contacting a biological sample with an agent that results in a covalently bound RBP-RNA complex in the biological sample;
-   b. obtaining RNA fragments from the biological sample;
-   c. selecting the RBP-RNA complex containing the known component with a molecule that specifically interacts with the known component;
-   d. purifying the RBP-RNA complex containing the known component under stringent conditions; and
-   e. identifying the unknown RBP from the RBP-RNA complex containing the known component.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts the CLIP method. (A) Schematic of CLIP method. (B) Purification of Nova-1-RNA covalent complexes by SDS-PAGE. Adult mouse hindbrain tissue was UV irradiated (+XL), protein-RNA complexes immunoprecipitated with Nova antiserum, RNA labeled with ³²P, and complexes visualized by autoradiography. Without UV irradiation (−XL), no protein-RNA complexes were detected. (C) CLIP performed using mouse forebrain (postnatal day 6; N2^(+/+)) revealed ~70 kilodalton (kDa) and ~55 kDa RNA-protein complexes cross-linked to Nova-2 (N2) (~70 kDa upper band) and smaller isoforms of Nova-1 and Nova-2 (~55 kDa lower band). When Nova-2^(−/−) forebrain was used for CLIP, the 70 kDa Nova-2 band was absent. When the UV cross-linked sample was immunoprecipitated with normal rabbit serum (Control IP), no cross-linked protein-RNA complex was apparent.

FIG. 2. Directional cloning of purified cross-linked RNA using RNA primers and T4 RNA ligase. A. ³²P-labeled RNA was purified from N2A cells following UV cross-linking and IP with anti-Nova antiserum. The size of the RNA fragments ranged from 24-150 bases; the modal size of the RNA was approximately 60 bases. B. Here, the purified RNA fragments were ligated to 5′ and 3′ linker oligonucleotides, which added 16 bases to each end of the molecule. The majority of the labeled RNA fragments shifted in size by 32 bases, indicating successful ligation. C. RNA isolated from regions 1 and 2 in B were amplified by RT-PCR with specific primers. The prominent band at 32 bases was product from the ligation of the two RNA oligonucleotides without insert. D. The products in C were further divided and further amplified by PCR. These products are then used for cloning and sequencing of the RNA insert tags.

FIG. 3. Analysis of 340 Nova CLIP fragments. (A) Genomic location of the tags. Tags belonging to genomic regions with no annotated transcripts were labeled as ‘unclassified’. 189 fragments aligned to introns within pre-mRNA, 107 to mature mRNA, and 55 to genomic regions to which no transcript has been assigned as yet. (B) YCAY tetramer abundance in Nova and Hu CLIP fragments in comparison to control tags. The average number of YCAY tetramers per Nova CLIP fragment was 4.18 (99% confidence interval ±0.39, average tag length 71±18 nucleotides; n=340) compared with 1.7 per CLIP fragment of an unrelated RBP, Hu (99% confidence interval ±0.21, average tag length 62±16 nucleotides; n=94) and 1.1 per random fragment of transcribed genomic sequence (Control; 99% confidence interval ±0.03, average tag length 71±18 nucleotides; n=3400). (C) Frequency of nucleotides flanking all CA dimers in Nova CLIP and genomic tags. Certain nucleotides (A in position 1, U in position 3, C in position 4, A in position 5, U in position 6, and C in position 7; in the control sequence, C in position 4 and A in position 5) had a frequency 25% higher than expected on the basis of total nucleotide composition of 29% U, 35% C, 20% A and 17% G in Nova CLIP fragments. (D) The 5 most frequent hexamers in Nova CLIP fragments, and the ratio of the average observed/expected abundance compared to control tags; even the least abundant of these, UCAUCC, was in 10 fold excess relative to the average control tag (p=0.013, z-test). (E) Filter binding assay results, using Nova-2 fusion protein at the indicated concentrations, and synthetic RNAs shown in (F). (F). Sequences of transcribed CLIP fragment RNAs, and control RNAs corresponding to genomic sequence immediately 5′ to the CLIP fragments, used in filter binding assay. (G). Distribution of tags relative to number of YCAY tetramers they contain. (H) Annotated list of Nova CLIP fragments.
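For illustration, the per-fragment YCAY count summarized in panel (B) can be computed along the following lines. This is a minimal sketch, not the analysis actually used to produce the figure: it assumes that Y denotes a pyrimidine (C or U), that overlapping tetramers are each counted, and that sequences may be supplied as either RNA or DNA. The function names and example sequences are hypothetical.

```python
import re
from typing import Sequence

# Y is the IUPAC code for a pyrimidine (C or U in RNA).
# The zero-width lookahead lets overlapping tetramers be counted.
YCAY = re.compile(r"(?=[CU]CA[CU])")


def count_ycay(seq: str) -> int:
    """Count YCAY tetramers in a sequence, overlaps included."""
    rna = seq.upper().replace("T", "U")  # accept DNA-style input
    return len(YCAY.findall(rna))


def mean_ycay_per_fragment(fragments: Sequence[str]) -> float:
    """Average number of YCAY tetramers per fragment."""
    if not fragments:
        raise ValueError("no fragments supplied")
    return sum(count_ycay(f) for f in fragments) / len(fragments)


if __name__ == "__main__":
    # Made-up sequences for illustration; not taken from the SEQ ID listing.
    examples = ["UCAUCAUGG", "GGGAGGG", "ccatucac"]
    for s in examples:
        print(s, count_ycay(s))
    print(mean_ycay_per_fragment(examples))
```

Applying the same function to CLIP fragments and to length-matched control fragments gives the two averages whose comparison is described in panel (B).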

FIG. 4. Nova-dependent regulation of JNK2 (A), neogenin (B) and gephyrin (C) alternative splicing. Schematic of pre-mRNA alternative splicing in the vicinity of the Nova CLIP fragments are shown on left. Autoradiograms (center) of RT-PCR products for each transcript, which were generated using RNA isolated from Nova 2^(+/+) (WT) and Nova 2^(−/−) (KO) P6-7 brain cortex, show the migration of bands corresponding to specific spliced isoforms. Each autoradiogram was quantitated and plotted with the standard error from 3 litters. In (A), asterisk marks a minor splice variant (isoform IV), and JNK2 PCR products present after digestion with AluI are illustrated. In (C), Western blot analysis of gephyrin protein in Nova 2 P1 WT and KO cortex is shown. (D) RT-PCR analysis of gephyrin exon 9 splicing in indicated mouse tissues.

FIG. 5. (A). List of 21 multiple-hit Nova CLIP fragments organized by primary encoded function. (B). A list of Nova CLIP fragments belonging to transcripts coding for proteins with a role in inhibitory control.

FIG. 6. Somatodendritic Nova: simultaneous detection with gephyrin in the postsynaptic cytoplasm. (A) Immunoblot analysis of Nova distribution in cytoplasmic and nuclear fractions from P7 mouse brain (equal volumes of each fraction were loaded in lanes 1 and 2; 50 micrograms (μg) of protein was loaded in lanes 3 and 4). Hsp90 was used as a cytoplasmic marker, and brPTB as nuclear marker. (B) Nova immunoreactivity within the post-synaptic dendrite; gold particles associated with small cisternae. (C1) Nova immunoreactivity in motor neurons was within the nucleus and somatodendritic compartments; in dendrites Nova accumulated at the dendritic periphery (arrowheads) and branch points (arrow). (C2-4) Colocalization of Nova (in red-2 and 3; red areas in original color photo are indicated by arrowheads) (in green-4; green areas in original color photo are indicated by arrowheads) and synapsin (in green-2 and 3; green areas in original color photo are lighter areas not indicated by arrowheads) and gephyrin (in red-4; red areas in original color photo are lighter areas not indicated by arrowheads). Anti-Nova antibodies: affinity purified rabbit anti-Nova (C1 and C3) human POMA serum (C2 and C4). Scale bar: 15 μm (C1); 9 μm (C2, C3); 6 μm (C4). (D) Electron microscopic co-detection of Nova and gephyrin immunoreactivity within the postsynaptic cytoplasm. A1, A2, B, Nova-1-IR (10 nm gold particles, arrows) beneath post-synaptic regions gephyrin IR (15 nm gold particles, arrowheads) A2, high magnification of the postsynaptic cytoplasm of A1. Synaptic boutons are indicated by the symbol “b”. Antibodies: anti-Nova rabbit serum. Scale bar: 0.2 μm (A2, B); 0.4 μm (A1).

FIG. 7. Colocalization of Nova protein and GlyRα2 mRNA in motor neurons. (A, B) Fluorescence microscopy and FISH demonstrated GlyRα mRNA and Nova immunoreactivity at the dendritic periphery, which demonstrated accumulation of GlyRα mRNA and Nova protein at the site (arrows). (C-F) Nova protein (HRP immuno-labeling) and GlyRα2 mRNA ISH signal (gold particles) formed aggregates within the dendritic cytoplasm (arrows) and in front of synaptic boutons (arrowheads, b). Note in (D) and (F) the association of the Nova protein and mRNA signals with small cisternae. Anti-Nova antibodies: human POMA serum. Scale bar: 0.2 μm (B-E); 0.35 μm (D).

FIG. 8A. Annotated list of 114 neuronal Hu protein CLIP fragment sequences. B. Sequence alignment of Hu CLIP fragment sequences.

FIG. 9. Generation and characterization of Nova-2 knockout mice. A. Cloning scheme. B. Southern blot showing deletion of Nova-2 from genomic DNA. C. Western blot showing elimination of 55 kDa and 70 kDa isoforms of Nova-2 from knockout animals. D. Size reduction of Nova-2 knockout mice. E. Survival curve of Nova-2 mice.

FIG. 10. Alternative splicing defects in Nova-2 knockout mice. A. Nova-2 mice are deficient in inclusion of γ2 L exon of GABAγ2 RNA. B. Splicing pattern of alternatively spliced exons in GABAγ2, GlyR α2, Nova, src, and ICH-1 in Nova-2 knockout mice.

DETAILED DESCRIPTION OF THE PRESENT INVENTION

The present invention provides, in one embodiment, methods of purifying an RNA molecule interacting with an RBP of interest, comprising covalent cross-linking and immunoprecipitation, and the use of such methods to analyze association of an RNA transcript with an RBP. The invention also provides sequences of RNA molecules that mediate binding to an RBP, proteins encoded by the sequences, methods of identifying the sequences, and the use of the sequences in a screening assay to identify bioactive molecules. The invention also provides RNA motifs found among the sequences and compounds that bind to the RNA motifs. In addition, the invention provides methods of treating diseases associated with a function of an RNA binding protein.

In another embodiment, the present invention provides a method for purifying an RNA molecule interacting with an RBP of interest in a biological sample, comprising the steps of: (a) contacting the biological sample with an agent that creates a covalent bond between the RNA molecule and the RBP of interest, thereby generating a covalently bound RBP-RNA complex containing the RNA molecule; (b) cleaving the RNA molecule by contacting the RBP-RNA complex with an agent capable of cleaving a bond thereof, thereby generating a fragment of the RNA molecule, wherein the fragment is at least 22 nucleotide bases in length; (c) selecting the RBP-RNA complex with a molecule that specifically interacts with a component of the RBP-RNA complex; and (d) purifying the RBP-RNA complex under stringent conditions, thereby purifying an RNA molecule interacting with an RBP of interest. The general term for methods of the present invention for purifying an RNA molecule interacting with an RBP of interest, comprising covalent cross-linking and immunoprecipitation, is “the CLIP method” (cross-linking and immunoprecipitation method).

In another embodiment, the present invention provides a method of identifying an RNA molecule interacting with an RBP of interest, comprising purifying an RNA molecule interacting with the RBP of interest by the CLIP method, and identifying the RNA molecule, thereby identifying an RNA molecule interacting with an RBP of interest.

In one embodiment, the plurality of RNA molecules interacting with the known RBP are analyzed to generate a profile of RNA molecules interacting with the known RBP in the biological sample.

In one embodiment, the RNA fragments that are generated, or the RNA in the RBP-RNA complex, is any type of RNA known in the art, such as, for example, tRNA (transfer RNA), snRNA (small nuclear RNA), rRNA (ribosomal RNA), mRNA (messenger RNA), anti-sense RNA, small inhibitory RNA (siRNA), micro RNA (miRNA) and ribozymes.

In one embodiment, as shown in FIG. 1A, a biological sample is contacted with an agent that results in a covalently bound RBP-RNA complex (step 1 of FIG. 1A). In this embodiment, the biological sample is mouse brain tissue that has been disassociated into a cell suspension, and the agent is UV irradiation of 254 nm wavelength. As a result of contact with the agent, a covalent bond is formed between an RNA molecule and a protein molecule in close contact with the RNA molecule. This RNA molecule covalently bound to a protein molecule is referred to herein as a covalently bound RBP-RNA complex. In this embodiment, cells are collected and lysed with detergent and high salt, which disassociates RNA from protein in some RBP-RNA complexes that are not covalently bound.

Covalent cross-linking has the advantage of forming bonds between RBPs and RNA molecules that are in direct contact. Covalent binding enables the use, in some embodiments of the present invention, of rigorous purification schemes for obtaining highly purified RBP-RNA complexes. In one embodiment, the purification scheme may comprise immunoprecipitation. In another embodiment, the purification scheme may comprise rigorous washing of immunoprecipitates. In another embodiment, the purification scheme may comprise boiling complexes in SDS. In another embodiment, the purification scheme may comprise separating complexes on SDS-PAGE. In another embodiment, the purification scheme may comprise transferring samples to nitrocellulose (NC), which retains RNA-protein complexes, but not free RNA.

In another embodiment, the covalent bond enables partial cleavage of RNA molecules without affecting their protein binding, such that only short RNA fragments can be purified. In another embodiment, this partial cleavage enables identification of the region of the RNA responsible for binding to the RBP or another. In another embodiment, this partial cleavage facilitates purification of the RBP-RNA complex. Following purification of RNA-protein complexes that have been covalently cross-linked, protein can be digested with proteases, and RNA can be cloned using linker ligation and RT-PCR. Each of these methods represents a separate embodiment of the present invention, and is described in more detail herein.

In another embodiment, the present invention provides a method for purifying an RNA molecule interacting with an RBP of interest in a biological sample, comprising the steps of (a) contacting the biological sample with an agent that creates a covalent bond between the RNA molecule and the RBP of interest, thereby generating a covalently bound RBP-RNA complex containing the RNA molecule; (b) cleaving the RNA molecule with an agent capable of cleaving a bond thereof, thereby generating a fragment of the RNA molecule, wherein the fragment is at least 22 nucleotide bases in length; (c) selecting the RBP-RNA complex with a molecule that specifically interacts with a component of the RBP-RNA complex; and (d) purifying the RBP-RNA complex, wherein the purifying step comprises an agent that disrupts an intermolecular interaction, thereby purifying an RNA molecule interacting with an RBP of interest.

In another embodiment, the present invention provides a method for purifying an RNA molecule interacting with an RBP of interest in a biological sample, comprising the steps of (a) contacting the biological sample with an agent that creates a covalent bond between the RNA molecule and the RBP of interest, thereby generating a covalently bound RBP-RNA complex containing the RNA molecule; (b) cleaving the RNA molecule with an agent capable of cleaving a bond thereof, thereby generating a fragment of the RNA molecule, wherein the fragment is at least 22 nucleotide bases in length; (c) selecting the RBP-RNA complex with a molecule that specifically interacts with a component of the RBP-RNA complex; and (d) purifying the RBP-RNA complex, wherein the purifying comprises a chromatographic method, thereby purifying an RNA molecule interacting with an RBP of interest.

In another embodiment, the present invention provides a method for purifying an RBP present in an RBP-RNA complex containing a known component, comprising the steps of (a) contacting said RBP-RNA complex with an agent that creates a covalent bond between two components of said RBP-RNA complex; (b) cleaving an RNA molecule of said RBP-RNA complex with an agent capable of cleaving a bond of said RNA molecule, thereby generating a fragment of said RNA molecule, wherein said fragment is at least 22 nucleotide bases in length; (c) selecting said RBP-RNA complex with a molecule that specifically interacts with said known component; (d) purifying said RBP-RNA complex under stringent conditions; and (e) removing said RBP from said RBP-RNA complex, thereby purifying an RBP present in an RBP-RNA complex containing a known component.

In one embodiment, the term “contacting”, “contact” or “contacted” when in reference to a cell refers to direct exposure of the cell to an agent, compound or composition of the invention. In another embodiment, the term “contacting”, “contact” or “contacted” when in reference to a cell refers to indirect exposure of the cell to an agent, compound or composition of the invention. In one embodiment, contacting a cell may comprise subjecting the cell to electromagnetic radiation. In another embodiment, a cell is exposed directly to a chemical that forms the covalent bond. In another embodiment, supply to the cell is indirect, such as via provision in a culture medium that surrounds the cell. In another embodiment, contacting a cell may comprise direct injection of the cell through any means well known in the art, such as microinjection.

In one embodiment, a “target cell” can be, for example, a type of cell or tissue in an organism, or a single cell type, e.g., grown in tissue culture. In another embodiment, an expression vector-ligand complex can be formed in which the ligand comprises a fusogenic viral peptide to disrupt endosomes, allowing the expression vector to avoid lysosomal degradation. In yet another embodiment, the expression vector can be targeted in vivo for cell specific uptake and expression, by targeting a specific receptor (see, e.g., PCT Publications WO 92/06180 dated Apr. 16, 1992; WO 92/22635 dated Dec. 23, 1992; WO92/20316 dated Nov. 26, 1992; WO93/14188 dated Jul. 22, 1993; WO 93/20221 dated Oct. 14, 1993). Alternatively, the expression vector can be introduced intracellularly and incorporated within host cell DNA for expression, by homologous recombination (Koller and Smithies, 1989, Proc. Natl. Acad. Sci. USA 86:8932-8935; Zijlstra et al., 1989, Nature 342:435-438). Each of these methods represents a separate embodiment of the present invention.

In one embodiment, the biological sample of step (a) is from a healthy source. In another embodiment, the biological sample of step (a) is from a diseased source. In another embodiment, the biological sample of step (a) may comprise a cell culture. In another embodiment, the biological sample of step (a) may comprise a cell line. In another embodiment, the biological sample may comprise a cell extract. In another embodiment, the biological sample may comprise a cell lysate. In another embodiment, the biological sample may comprise whole tissue. In another embodiment, the biological sample may comprise a tissue extract. In another embodiment, the biological sample is a tissue sample, such as, for example, a biopsy. In another embodiment, the biological sample may comprise a whole organ. In another embodiment, the biological sample may comprise a tumor. In another embodiment, the biological sample may comprise a tumor cell. In another embodiment, the biological sample may comprise a cell mass. In another embodiment, the tissue sample may comprise diseased tissue. In another embodiment, the biological sample may comprise a tumor cell or tumor cell extract. In another embodiment, the biological sample may comprise a pre-cancerous lesion, polyp, or cyst. In another embodiment, the biological sample may comprise a combination thereof. Each possibility represents a separate embodiment of the present invention.

In another embodiment, the biological sample may comprise a cellular component or compartment. In one embodiment, the cellular component or compartment is neuronal dendrites. In one embodiment, laser capture micro-dissection is used in conjunction with the CLIP method to purify RBP-RNA complexes, from, for example, molecular layers of brain, tumor sections, or tumor cells. Laser capture micro-dissection is, in one embodiment, performed by any technique known to those skilled in the art, such as, for example, the techniques described in Biotechniques. 34:42-46. Each such technique represents a separate embodiment of the present invention.

In one embodiment, cells comprising the biological sample are suspension cells. In another embodiment, the cells are adherent cells. In another embodiment, the cells are transformed cells. In another embodiment, the cells are tissue culture cells. In another embodiment, the cells are primary cell lines. Cells comprising the biological sample are, in one embodiment, grown by any method known to one skilled in the art. Each such method represents a separate embodiment of the invention.

In one embodiment, the biological sample is disrupted, disaggregated, homogenized, or lysed by any technique known in the art. For example, the biological sample may be made into a single-cell suspension using a nylon filter or mesh. Cells or tissue comprising the biological sample may, in one embodiment, be adhered to a substrate such as a chip, a slide, a dish, etc. The cells are, in one embodiment, washed according to techniques known to one skilled in the art. Each such technique represents a separate embodiment of the present invention.

In one embodiment, the covalent bond of step (a) is formed with irradiation. The source of irradiation may emit, in one embodiment, radiation of a discrete wavelength. In another embodiment, the source may emit radiation dispersed throughout a region of the electromagnetic radiation spectrum. In another embodiment, the source may emit a mixture of radiation, some of which is of a discrete wavelength, and some of which is dispersed throughout a region of the electromagnetic radiation spectrum.

In one embodiment, the irradiation may result from a polychromatic irradiation source. Polychromatic refers, in one embodiment, to a source that emits radiation of various wavelengths. Such wavelengths may be anywhere in the electromagnetic radiation spectrum. The radiation emission spectra of various types of irradiation sources are known in the art.

In another embodiment, the irradiation may result from a monochromatic irradiation source. Monochromatic refers, in one embodiment, to a source that emits radiation of a single wavelength. In another embodiment, monochromatic refers to a source that emits radiation primarily of a single wavelength.

In another embodiment, the irradiation may result from a mercury light. Mercury lamps emit radiation of 254 nm, and may also have polychromatic background emissions at other discrete wavelengths, e.g., 313 nm, 365 nm, 405 nm, 436 nm, 546 nm, 579 nm, 1015 nm and 1140 nm. This is a fairly unique characteristic of these types of lamps (see U.S. Pat. No. 6,611,375).

In another embodiment, the irradiation may result from a two-photon excitation apparatus (So P T et al, Cell Mol Bio (Noisy le grand) 44:771). In this technique, small structures are formed by multiple photon-induced polymerization or cross-linking of a precursor composition. “Multiple photon” as used herein means, in one embodiment, the simultaneous absorption of multiple photons by a reactive molecule. This method is described in detail in U.S. Pat. No. 6,316,153 and references therein.

In one embodiment, the irradiation used to form the covalent bond of step (a) is ultraviolet irradiation. Ultraviolet radiation, in one embodiment, is a form of energy that occupies a portion of the electromagnetic radiation spectrum (the electromagnetic radiation spectrum ranges from cosmic rays to radio waves). Ultraviolet radiation can come from many natural and artificial sources. Depending on the source of ultraviolet radiation, it may be accompanied by other (non-ultraviolet) types of electromagnetic radiation (e.g. visible light).

Particular types of ultraviolet radiation are herein described in terms of wavelength. Wavelength is herein described in terms of nanometers (“nm”). In one embodiment, ultraviolet radiation extends from approximately 180 nm to 400 nm. In another embodiment, the ultraviolet radiation has a wavelength of about 254 nm. In another embodiment, the ultraviolet radiation has a different wavelength. When a radiation source, by virtue of filters or other means, does not allow radiation below a particular wavelength (e.g. 320 nm), it is said to have a low end “cutoff” at that wavelength (e.g. “a wavelength cutoff at 300 nanometers”). Similarly, when a radiation source allows only radiation below a particular wavelength (e.g. 360 nm), it is said to have a high end “cutoff” at that wavelength (e.g. “a wavelength cutoff at 360 nanometers”). In another embodiment, the source of ultraviolet radiation is a fluorescent source. All of these sources represent separate embodiments of the present invention. In one embodiment, the device of the present invention comprises an additional filtering means. In one embodiment, the filtering means comprises a liquid filter solution that transmits only a specific region of the electromagnetic spectrum. The use of sources of irradiation is well known to those skilled in the art (see, for example Diffey, B L, Methods 28:4-13; and Chen J et al, Cancer J. 8:154-63). Each type of radiation represents a separate embodiment of the present invention.
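By way of illustration only, the effect of low end and high end cutoffs on a polychromatic source may be sketched as follows. The sketch uses the mercury lamp emission wavelengths recited above and the approximate 180 nm to 400 nm ultraviolet range; the function names, the parameter names, and the treatment of a wavelength exactly at a cutoff as transmitted are assumptions made for this example and are not part of the disclosed method.

```python
# Emission wavelengths (nm) recited above for mercury lamps.
MERCURY_LINES_NM = [254, 313, 365, 405, 436, 546, 579, 1015, 1140]

# Approximate ultraviolet range (nm) used in one embodiment.
UV_MIN_NM, UV_MAX_NM = 180, 400


def is_ultraviolet(wavelength_nm):
    """Return True if the wavelength lies in the approximate UV range."""
    return UV_MIN_NM <= wavelength_nm <= UV_MAX_NM


def transmitted_lines(lines_nm, low_cutoff_nm=None, high_cutoff_nm=None):
    """Keep the wavelengths that pass the given cutoffs.

    A low end cutoff blocks radiation below it; a high end cutoff blocks
    radiation above it. A wavelength equal to a cutoff is treated as
    transmitted (an assumption made for this sketch).
    """
    kept = []
    for wavelength in lines_nm:
        if low_cutoff_nm is not None and wavelength < low_cutoff_nm:
            continue
        if high_cutoff_nm is not None and wavelength > high_cutoff_nm:
            continue
        kept.append(wavelength)
    return kept


# Which mercury lines are ultraviolet, and which pass a 300-360 nm window?
uv_lines = [w for w in MERCURY_LINES_NM if is_ultraviolet(w)]
window = transmitted_lines(MERCURY_LINES_NM, low_cutoff_nm=300, high_cutoff_nm=360)
print(uv_lines, window)
```

In this sketch a source filtered with a low end cutoff at 300 nm no longer delivers the 254 nm line, which is the wavelength used for cross-linking in one embodiment above.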

In one embodiment, a chemical group such as, for example, puromycin is added to RNA to facilitate formation of the covalent bond of step (a). This method is described in Rodriguez-Fonseca C et al (RNA 6:744-54).

In one embodiment, the covalent bond of step (a) is formed with a chemical. In one embodiment, the chemical is formaldehyde. In another embodiment, the chemical is a derivative of formaldehyde. In another embodiment, the chemical is paraformaldehyde. In another embodiment, the chemical is glutaraldehyde. In another embodiment, the chemical is osmium tetroxide. In another embodiment, the chemical is acetone. In another embodiment, the chemical is an alcohol. In another embodiment, the chemical is an NHS ester. In another embodiment, the chemical is a maleimide. In another embodiment, the chemical is a haloacetyl. In another embodiment, the chemical is a pyridyl disulfide. In another embodiment, the chemical is a sulfhydryl modifier such as SATA, SPDP or Traut's Reagent. In another embodiment, the chemical is hydrazide. In another embodiment, the chemical is 1-Ethyl-3-(3-Dimethylaminopropyl)-Carbodiimide Hydrochloride. In another embodiment, the chemical is an aryl azide or a derivative thereof. In another embodiment, the chemical is any other cross-linking compound known in the art. The cross-linking compound may, in one embodiment, be applied over a broad range of concentrations. Each type of chemical represents a separate embodiment of the present invention.

In one embodiment, the cross-linking compound is photo-activated. Chemical cross-linking methods are known to those skilled in the art (see, for example, Hecht A et al, Methods Mol. Biol. (1999) 119:469-79; and Strutt H et al, Methods Mol. Biol. (1999) 119:455-67).

In one embodiment, the covalent bond is a reversible bond. In another embodiment, the covalent bond may be an irreversible bond. In another embodiment, a reversible bond is a bond capable of being broken or disrupted by exposure to heat, acid, base, or another means without destroying the remainder of the molecule.

In step (b) of the present invention, RNA molecules in the biological sample are cleaved (digested, broken, or fragmented) to obtain fragments of at least 22 bases in length (FIG. 1A, step 2). Step (b) may, in some embodiments, facilitate the selection of RBP-RNA complexes. In another embodiment, step (b) may facilitate identification of binding sites on RNA molecules that are isolated by this method.

In one embodiment, the digestion of step (b) generates a modified RBP-RNA complex of interest containing a fragment of the RNA molecule present in the original RBP-RNA complex. In one embodiment, step (c) and subsequent steps are performed on the modified RBP-RNA complex. In another embodiment, step (c) is performed prior to the modification performed in step (b). Each possibility represents a separate embodiment of the present invention.

In one embodiment, cleavage of the RNA molecules is performed by contact with an agent capable of breaking a bond of an RNA molecule. In one embodiment, the agent is capable of breaking a bond of any RNA molecule in a sequence-specific manner. In another embodiment, the agent preferentially breaks bonds of RNA molecules having a particular sequence. In one embodiment, the bond is a phosphodiester bond.

In one embodiment, as depicted in FIG. 1A, the biological sample is a lysate at this point in the method. In the embodiment depicted, the fragments are obtained by digestion with limiting amounts of RNAse T1, which cleaves RNA molecules in a largely random fashion, leaving RNA fragments of approximately 100 bases. In the embodiment depicted, the lysate is then subjected to a high-speed centrifugation, which removes high molecular weight material from the cell. In one embodiment, the fragments average about 70-100 nucleotides in length.

Some of the RNA fragments obtained by step (b) of the present invention may contain the RBP binding site. In one embodiment, an RBP is bound to the binding site. It will be understood to one skilled in the art that, in one embodiment, the size of the RNA fragments may reflect the fact that an RBP is bound.

In one embodiment, the biological sample is treated with a DNAse prior to step (b). In another embodiment, the DNAse treatment may follow step (b). In another embodiment, the DNAse treatment is simultaneous with step (b). In another embodiment, any DNAse known in the art may be used. The use of such enzymes is well known to those skilled in the art, and is described, for example, in Molecular Cloning, (2001), Sambrook and Russell, eds. The use of each DNAse represents an additional embodiment of this invention.

In another embodiment, step (b) is carried out with a nuclease. In one embodiment, the nuclease is RNAse T1. T1 digestion yields —OH groups on the 5′ ends and 2′, 3′ cyclic phosphate groups on the 3′ ends of RNA fragments (FIG. 1A). In one embodiment, fragments generated by RNAse T1 digestion are suitable for labeling with T4 PNK and ³²P phosphate, as disclosed herein. In another embodiment, fragments generated by RNAse T1 digestion are suitable for directional ligation of nucleotide linkers onto the fragments. In another embodiment, fragments generated by RNAse T1 digestion are suitable for directional subcloning of the fragments into a vector.

In one embodiment, titrations of RNAse T1 are performed to ascertain dilutions which yield RNA CLIP fragments of the desired length. Techniques for titrating enzymes are known to those skilled in the art.
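By way of illustration only, the relationship between the extent of digestion and the resulting fragment length may be explored in silico before a bench titration. The sketch below assumes that RNAse T1 cleaves on the 3′ side of guanosine residues and models a limiting amount of enzyme as a per-site cleavage probability; the function names and the parameter `p_cut` are hypothetical, and the model is a simplification rather than a description of the disclosed method.

```python
import random


def partial_t1_digest(rna, p_cut, rng):
    """Cut 3' of each G with probability p_cut; return the fragments."""
    fragments, start = [], 0
    for i, base in enumerate(rna):
        # Skip the final base: cutting after it would give an empty fragment.
        if base == "G" and i < len(rna) - 1 and rng.random() < p_cut:
            fragments.append(rna[start:i + 1])
            start = i + 1
    fragments.append(rna[start:])
    return fragments


def mean_fragment_length(fragments, min_len=22):
    """Mean length of fragments of at least min_len bases (0.0 if none)."""
    kept = [len(f) for f in fragments if len(f) >= min_len]
    return sum(kept) / len(kept) if kept else 0.0


# Hypothetical titration: lower p_cut stands in for a higher enzyme dilution.
rng = random.Random(0)
transcript = "".join(rng.choice("ACGU") for _ in range(5000))
for p_cut in (0.5, 0.1, 0.05, 0.02):
    frags = partial_t1_digest(transcript, p_cut, rng)
    print(p_cut, len(frags), round(mean_fragment_length(frags), 1))
```

The default `min_len` of 22 mirrors the minimum fragment length recited for step (b); the per-site probability that gives fragments in a desired range would still have to be matched empirically to an enzyme dilution.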

The covalent bond formed in step (a) enables, in one embodiment, the use of nucleases such as RNAse T1 in the CLIP method. In another embodiment, the covalent bond formed in step (a) enables fragmentation of the RNA (by a variety of means) without separating the RNA from its RBP-RNA complex. Fragmentation facilitates, in one embodiment, extrication and subsequent purification of RBP-RNA complexes that are bound to ribosomes and other cellular structures. RNAse T1 cleaves RNA molecules in a relatively sequence non-specific fashion. Each of these properties of CLIP contributes to its ability to purify a representative sample of RNA molecules interacting with a given RBP.

In one embodiment, the nuclease is an endonuclease. In another embodiment, the nuclease is an exonuclease. In another embodiment, the nuclease is S1 Nuclease. In another embodiment, the nuclease is Mung Bean Nuclease. In another embodiment, the nuclease is Bal31 nuclease. In another embodiment, the nuclease is T7 gene 6 exonuclease. In another embodiment, the nuclease is Exonuclease III. In another embodiment, the nuclease is the 3′-5′ exonuclease activity of a polymerase, such as T4 DNA polymerase, a Klenow fragment, and f1 gene product II or homologous enzymes from other filamentous bacteriophage (Meyer and Geider, J. Biol. Chem. 254:12636). In another embodiment, the nuclease is any nuclease known in the art.

A nuclease, according to one embodiment of the invention, also includes Saccharomyces cerevisiae RAD27, and Schizosaccharomyces pombe RAD2, Pol I DNA polymerase associated 5′ to 3′ exonuclease domain, (e.g. E. coli, Thermus aquaticus (Taq), Thermus flavus (Tfl), Bacillus caldotenax (Bca), Streptococcus pneumoniae) and phage functional homologues of FEN including but not limited to T5 5′ to 3′ exonuclease, T7 gene 6 exonuclease and T3 gene 6 exonuclease. The use of these nucleases is familiar to those skilled in the art (see, for example, Molecular Cloning, (2001), Sambrook and Russell, eds.). Each of these nucleases represents a separate embodiment of this invention.

In other embodiments, obtaining RNA fragments in step (b) is performed using a cleaving agent. In another embodiment, the cleaving agent is a single-stranded-specific endonuclease. In another embodiment, the cleaving agent is a double-stranded-specific endonuclease. In another embodiment, the cleaving agent is a chemical cleaving agent which preferentially cleaves single-stranded molecules. In another embodiment, the cleaving agent is a chemical cleaving agent which preferentially cleaves double-stranded molecules. In another embodiment, the cleaving agent is S1 Nuclease. In another embodiment, the cleaving agent is Mung Bean Nuclease. In another embodiment, the cleaving agent is potassium permanganate. In another embodiment, the cleaving agent is a cleaving agent which cleaves double-stranded oligonucleotides in a random or pseudorandom way. In another embodiment, the cleaving agent is DNase I. The concentration and cutting time of the cleaving agent must, in one embodiment, be determined experimentally for each hybridization procedure. Each type of cleaving agent represents a separate embodiment of the present invention.

In one embodiment, step (b) is performed by fragmentation. The sequences can be, for example, either randomly fragmented or fragmented at specific sites in the nucleic acid sequence. Any known method of fragmentation may be employed in step (b), according to this embodiment. Various methods of fragmenting nucleic acids will be known to those of skill in the art. These methods may be, for example, either chemical or physical in nature.

In alternate embodiments, fragmentation may include partial degradation with a DNAse, RNAse, partial depurination with acid followed by heating, and restriction enzymes or other enzymes, which cleave nucleic acid at known or unknown locations. Physical fragmentation methods may involve subjecting the nucleic acid to a high shear rate. High shear rates are produced, in one embodiment, by moving nucleic acid through a chamber or channel with pits or spikes, or forcing the nucleic sample through a restricted size flow passage, e.g. an aperture having a cross sectional dimension in the micron or submicron scale. Each of these methods represents a separate embodiment of this invention.

In other embodiments of this invention, step (b) is performed by using radical-generating coordination complexes or with a syringe-operated silica micro-column. Those of skill in the art will be familiar with methods of fragmenting RNA (see, for example, Current Protocols in Molecular Biology, (1998) Ausubel, et al, eds.), each of which represents another embodiment of this invention. In another embodiment, RNA is fragmented by heat and ion-mediated hydrolysis.

In another embodiment, step (b) is performed by physical or chemical means. The sequences can be randomly fragmented or fragmented at specific sites in the nucleic acid sequence. In another embodiment, step (b) is performed by breaking the nucleic acid in the biological sample. In another embodiment, step (b) is performed by exposing the nucleic acid to harsh physical treatment (e.g., shearing or irradiation). In another embodiment, step (b) is performed by exposing the nucleic acid to harsh chemical agents (e.g., free radicals, including, but not limited to, hydroxyl radicals; metal ions; acid treatment). The reaction conditions suitable for fragmenting nucleic acid molecules by physical or chemical methods are well known in the art. Furthermore, partial PCR extension, PCR stuttering, and other related methods for producing partial length copies of a parental sequence can be used to effect “fragmentation”, e.g., to obtain a hybrid product which contains segments derived from different parental sequences. Any method of fragmenting nucleic acid known in the art represents an additional embodiment of this invention.

As mentioned, in one embodiment fragmentation of the RNA molecules present in the RBP-RNA complexes renders them suitable for subsequent subcloning. “Subcloning” refers, in one embodiment, to inserting an oligonucleotide into a nucleotide molecule. In one embodiment, isolated DNA encoding an RNA transcript can be inserted into an appropriate expression vector that is suitable for the host cell to be employed such that the DNA is transcribed to produce the RNA. A large number of vector-host systems known in the art may be used in this embodiment. A vector may include, in some embodiments, an appropriate selectable marker. The vector may further include an origin of replication, and may be a shuttle vector, which can propagate both in bacteria, such as, for example, E. coli (wherein the vector comprises an appropriate selectable marker and origin of replication) and be compatible for propagation in vertebrate cells, or integration in the genome of an organism of choice. The vector according to this aspect of the present invention can be, for example, a plasmid, a bacmid, a phagemid, a cosmid, a phage, a modified or unmodified virus or an artificial chromosome. Many such vectors are commercially available, and their use is well known to those skilled in the art (see, for example, Molecular Cloning, (2001), Sambrook and Russell, eds.).

The insertion into a vector can, for example, be accomplished by ligating the DNA fragment into a vector which has complementary cohesive termini. However, if the complementary restriction sites used to fragment the DNA are not present in the cloning vector, the ends of the DNA molecules may be enzymatically modified. In another embodiment, any site desired may be produced by ligating nucleotide sequences (linkers) onto the DNA termini; these ligated linkers may comprise specific chemically synthesized oligonucleotides encoding restriction endonuclease recognition sequences.

In another embodiment, the nucleotide molecule into which the oligonucleotide is inserted may be a plasmid, cosmid, or the like, or a vector or strand of nucleic acid. In another embodiment, the nucleotide molecule is genetic material of a living organism, virus, phage, or material derived from a living organism, virus, or phage. In one embodiment, the nucleotide molecule is linear. In another embodiment, the nucleotide molecule is circular. In another embodiment, the nucleotide molecule is concatemerized. In another embodiment, the nucleotide molecule may be of any length. Each of these types of nucleotide molecules represents a separate embodiment of the invention. Methods for subcloning are known to those skilled in the art, and are described, for example in Molecular Cloning, (2001), Sambrook and Russell, eds.

In one embodiment, the biological sample is subjected to a centrifugal force at some point after step (a). In one embodiment, the centrifugation or settling step removes high-molecular weight (MW) material from the biological sample. In another embodiment, the centrifugation or settling step removes ribosomes from the biological sample. In another embodiment, the biological sample is allowed to settle. Each method represents a separate embodiment of the present invention.

In one embodiment, step (b) is performed before step (c). In another embodiment, step (b) is performed after part or all of step (c). In another embodiment, step (b) is performed concurrently with all or part of step (c).

In step (c) of the present invention, an RBP-RNA complex of interest is selected with a molecule that specifically interacts with a component of the RBP-RNA complex of interest (FIG. 1A, step 3). In one embodiment, as depicted in FIG. 1A, the method of selection is immunoprecipitation and the molecule that specifically interacts with a component of the RBP-RNA complex of interest is antisera directed against the protein of interest, in this case Nova-1 or Nova-2 protein. The immunoprecipitation method depicted in FIG. 1A comprises thorough washing of RBP-RNA complexes of interest, resulting in the selective removal of unwanted molecules. In the embodiment depicted in FIG. 1A, RNA molecules remaining after the washing are phosphorylated and labeled with γ-³²P ATP, allowing detection at a later step.

In one embodiment, the biological sample is pre-cleared prior to step (c). “Pre-clearing” comprises, in one embodiment, immunoprecipitating the sample using pre-immune serum or an irrelevant antibody, and is typically used to remove proteins and other substances that tend to stick non-specifically to other molecules (“sticky molecules”). In another embodiment, a similar method of removing sticky molecules is employed. In another embodiment, sticky molecules are “blocked”, or rendered less sticky, by the addition of a blocking agent such as milk powder to the solution used in step (c). Each such method represents a separate embodiment of the present invention.

In one embodiment, the component of the RBP-RNA complex of interest selected in step (c) is an RBP. In another embodiment, the component is an RNA-associated protein. In another embodiment, the component is a nucleic acid associated with the RBP-RNA complex. In another embodiment, the component of the RBP-RNA complex of interest selected in step (c) is an mRNA molecule associated with the RBP-RNA complex.

In another embodiment, the component is another molecule or compound (e.g., carbohydrate, lipid, vitamin, etc.) that associates with the RBP-RNA complex.

In one embodiment, the component of the RBP-RNA complex of interest selected in step (c) is Nova-1 protein, Nova-2 protein, or a combination thereof. According to this aspect of the invention, the Nova-1 and Nova-2 may include wild-type protein sequences, as well as other variants (including alleles) of the native protein sequence. In another embodiment, the component selected is a 55 kDa isoform of Nova-2. In another embodiment, the component selected is a 70 kDa isoform of Nova-2. Nova-1 and Nova-2 proteins are homologous to one another, and have similar or identical function (FIGS. 9-10).

In one embodiment, “variants” refers to proteins or genes that result from natural polymorphisms. In another embodiment, “variants” refers to proteins or genes that are synthesized by recombinant methodology. In another embodiment, “variants” refers to proteins or genes that differ from wild-type protein by one or more amino acid substitutions, insertions, deletions, or the like. As will be appreciated by those skilled in the art, a nucleotide sequence encoding a protein mentioned herein or a variant may differ from the known native sequences, due to codon degeneracies, nucleotide polymorphisms, or amino acid differences. Each of these represents an additional embodiment of the invention.

In another embodiment, the component of the RBP-RNA complex of interest selected in step (c) is an ELAV/Hu protein such as HuA, HuB, HuC, HuD or mHuR, or a combination thereof. Hu family proteins are RNA-binding proteins; antibodies against them are found in patients with small-cell lung carcinoma and are associated with sensory neuronopathy (PEM/SN), paraneoplastic cerebellar degeneration and MS. HuD, a neuronal Hu protein, is an RBP that is believed to shuttle between the nucleus and cytoplasm. In another embodiment, the component of the RBP-RNA complex of interest bound to in step (c) is a FXRP family protein such as FMRP, FXRP1, or FXRP2, or a combination thereof. FMRP is a family of RBPs that are implicated in Fragile X Syndrome. In another embodiment, the component of the RBP-RNA complex of interest bound to in step (c) is Sjogren's Syndrome related antigen Ro (SS-A), Sjogren's Syndrome related antigen La (SS-B), or a protein belonging to the ribonuclear proteins (RNP) family, or a combination thereof. SS-A, SS-B, and RNP proteins are antigens that have been implicated in autoimmune disorders such as SLE, Sjogren's Syndrome, JRA, and HAM/TSP. In another embodiment, the component of the RBP-RNA complex of interest bound to in step (c) is calreticulin. Calreticulin is an RBP that has been implicated in Sjogren's syndrome, PBC, autoimmune hepatitis type 1, MS, coeliac disease, and yersiniosis. In another embodiment, the component of the RBP-RNA complex of interest bound to in step (c) is SMN protein, a protein belonging to the CELF proteins family, or SCA1 protein, or a combination thereof. In another embodiment, the component of the RBP-RNA complex of interest bound to in step (c) is SF2/ASF. SF2/ASF is a protein that has been implicated in spinal muscular atrophy. In another embodiment, the component of the RBP-RNA complex of interest bound to in step (c) is a small nucleolar ribonucleoprotein complex (snoRNP). SnoRNPs have been implicated in SLE and SSc. In another embodiment, the component of the RBP-RNA complex of interest bound to in step (c) is heterogeneous nuclear ribonuclear protein-A1 (hnRNP-A1). hnRNP-A1 has been implicated in HAM/TSP.

The term “homology”, as used herein, when in reference to any protein or peptide, may indicate, in one embodiment, a percentage of amino acid residues in the candidate sequence that are identical with the residues of a corresponding native polypeptide, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent homology, and not considering any conservative substitutions as part of the sequence identity. In another embodiment, conservative substitutions are considered as part of sequence identity when determining homology. Neither N- or C-terminal extensions nor insertions shall be construed as reducing identity or homology. Methods and computer programs for the alignment are well known in the art.

In one embodiment, “corresponding” refers to identity of greater than 70%. In another embodiment, “corresponding” refers to identity of greater than 75%. In another embodiment, “corresponding” refers to identity of greater than 80%. In another embodiment, “corresponding” refers to identity of greater than 85%. In another embodiment, “corresponding” refers to identity of greater than 90%. In another embodiment, “corresponding” refers to identity of greater than 95%. In another embodiment, “corresponding” refers to identity of greater than 97%. In another embodiment, “corresponding” refers to identity of greater than 98%. In another embodiment, “corresponding” refers to identity of greater than 99%. In another embodiment, “corresponding” refers to identity of 100%.
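By way of illustration only, the percent identity described above may be computed from two sequences that have already been aligned. The sketch below assumes gaps are written as “-” and, in keeping with the statement that extensions and insertions are not construed as reducing identity, counts only those alignment columns in which both sequences have a residue; the function names and this choice of denominator are assumptions made for this example, and conservative substitutions are not scored as identities.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither side is a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns containing a gap are excluded from the comparison.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def is_corresponding(aligned_a, aligned_b, threshold=70.0):
    """True if identity is strictly greater than the chosen threshold."""
    return percent_identity(aligned_a, aligned_b) > threshold


# Hypothetical aligned peptides: 7 of 8 compared residues are identical.
print(percent_identity("MKV-LSTAG", "MKVQLSTCG"))
```

The default threshold of 70% reflects the lowest identity recited above; any of the other recited thresholds can be passed as `threshold`.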

The term “homology”, as used herein, when in reference to any nucleic acid sequence similarly may indicate, in one embodiment, a percentage of nucleotides in a candidate sequence that are identical with the nucleotides of a corresponding native nucleic acid sequence.

Homology may be determined in the latter case by computer algorithm for sequence alignment, by methods well described in the art. For example, computer algorithm analysis of nucleic acid sequence homology may include the utilization of any number of software packages available, such as, for example, the BLAST, DOMAIN, BEAUTY (BLAST Enhanced Alignment Utility), GENPEPT and TREMBL packages.

An additional means of determining homology is via determination of candidate sequence hybridization, methods of which are well described in the art (See, for example, “Nucleic Acid Hybridization” Hames, B. D., and Higgins S. J., Eds. (1985); Molecular Cloning, (2001), Sambrook and Russell, eds.; and Current Protocols in Molecular Biology, (1998) Ausubel, et al, eds.). In one embodiment, methods of hybridization are carried out under moderate to stringent conditions, to the complement of a DNA encoding a native peptide derived from Nova-1, Nova-2, an ELAV/Hu protein, FXRP, SS-A, SS-B, an RNP protein, calreticulin, SMN protein, a protein belonging to the CELF proteins family, or SCA1 protein. Hybridization conditions are, for example, overnight incubation at 42° C. in a solution comprising: 10-20% formamide, 5×SSC (1×SSC being 150 millimolar (mM) NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5×Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA.

Protein and/or peptide homology for any amino acid sequence listed herein is, in one embodiment, determined by methods well described in the art, including immunoblot analysis, or via computer algorithm analysis of amino acid sequences, utilizing any of a number of software packages available, via established methods. Some of these packages may include the FASTA, BLAST, MPsrch or Scanps packages, and may employ the use of the Smith and Waterman algorithms, and/or global/local or BLOCKS alignments for analysis, for example. In another embodiment, the web site clustalw is used for analysis (FIG. 8B). The use of each of these methods is known to those skilled in the art. Each method for determining homology represents an additional embodiment of the present invention.

In one embodiment, step (c) comprises selecting the RBP-RNA complex by immunoprecipitation (IP). In another embodiment, the RBP-RNA complex is selected by magnetic separation. In one embodiment, the IP is performed in a similar manner to one of the embodiments described in Example 1.

The term “IP” herein, refers, in one embodiment, to a technique for selecting a molecule of interest from a biological sample. Briefly, the biological sample is contacted with a molecule that interacts with the molecule of interest, and the molecule that interacts with the molecule of interest is attached or adhered to a substrate. IP may include a step of washing the substrate to remove impurities. IP may, in one embodiment, comprise protein A/sepharose beads. In another embodiment, IP may comprise protein G/sepharose beads. In one embodiment, IP comprises magnetic beads such as Dynabeads. In another embodiment, IP may comprise any type of solid support, such as any type of bead, plate, column, a fiber, or an array. The molecule that specifically interacts with a component of the RBP-RNA complex of interest may be attached, in one embodiment, to the substrate using any known method, including chemical or physical attachment in some embodiments, as known in the art. Techniques for performing IP are known to those skilled in the art (see, for example, Current Protocols in Molecular Biology, (1998) Ausubel, et al, eds.). Each such method represents a separate embodiment of the present invention.

In one embodiment, step (c) comprises solid phase adsorption using calcium phosphate gel or hydroxyapatite, or solid phase binding. Solid phase binding is performed, in one embodiment, through ionic bonding, with either an anion exchanger, such as diethylaminoethyl (DEAE), or diethyl [2-hydroxypropyl]aminoethyl (QAE) SEPHADEX or cellulose; or with a cation exchanger such as carboxymethyl (CM) or sulfopropyl (SP) SEPHADEX or cellulose. Alternative means of solid phase binding include the exploitation of hydrophobic interactions, e.g., the use of a solid support such as phenyl-SEPHAROSE and a high salt buffer; affinity-binding, e.g., by attaching a specific DNA binding site of a Stat protein to an activated support; immuno-binding, e.g., using an antibody to the Stat protein bound to an activated support; as well as other solid phase supports including those that contain specific dyes or lectins etc. A further solid phase support technique that is often used at the end of the purification procedure relies on size exclusion, such as SEPHADEX and SEPHAROSE gels, or pressurized or centrifugal membrane techniques, using size exclusion membrane filters. Each of these methods represents a separate embodiment of the invention.

In another embodiment, a siliceous or silane-containing substrate such as, for example, glass, porous silica, or oxidized silicon materials is used as a solid support. This technique may be effected by any method well known to one skilled in the art, such as, for example, the method cited in U.S. Pat. No. 6,426,183.

In one embodiment, selecting the RBP-RNA complex of interest in step (c) comprises a separation step. In one embodiment, solid phase support separations are generally performed batch-wise with low-speed centrifugation or by column chromatography. In another embodiment, magnetic separation methods such as Dynabeads are used. In another embodiment, liquid chromatography separation is performed, using, for example, high performance liquid chromatography (HPLC), including such related techniques as FPLC. In another embodiment, size exclusion techniques may also be accomplished with the aid of low speed centrifugation. In addition, size permeation techniques such as gel electrophoretic techniques may be employed for separation. These techniques are generally performed in tubes, slabs or by capillary electrophoresis. Each of these methods represents a separate embodiment of the invention.

In one embodiment, the molecule that specifically interacts with a component of the RBP-RNA complex of interest of step (c) is an antibody that specifically binds the component. In another embodiment, the molecule is a nucleic acid that binds the component (e.g., an antisense molecule, an RNA molecule that binds the component). In another embodiment, the molecule is any other compound or molecule that binds a component of the complex. Each type of molecule represents a separate embodiment of the present invention.

The term “antibody” refers, in one embodiment, to an antiserum. In another embodiment, “antibody” refers to a purified antibody. In another embodiment, “antibody” refers to a modification of a purified antibody. In another embodiment, the antibody is polyclonal. In another embodiment, the antibody is monoclonal. Each type of antibody represents a separate embodiment of the present invention.

In one embodiment, the molecule that specifically interacts with a component of the RBP-RNA complex of interest binds the component directly (e.g., may be, in one embodiment, an antibody specific for the component), or binds the component indirectly (e.g., may be, in one embodiment, an antibody or binding partner for a tag on the component). The molecule that specifically interacts with a component of the RBP-RNA complex of interest of step (c) is attached, in one embodiment, to a solid support, such as a bead, plate, a column, a fiber, or an array. The molecule that specifically interacts with a component of the RBP-RNA complex of interest may be attached to the solid support using any known method, including chemical or physical attachment in some embodiments, as known in the art. Each molecule represents a separate embodiment of the present invention.

In another embodiment, the component is modified with a selectable element, the properties of which may then be exploited in order to remove the RBP-RNA complex from the mixed population. Non-limiting examples of selectable elements include: nucleic acid sequences, ligands, receptors, antibodies, hapten groups, antigens, biotin, streptavidin, enzymes and enzyme inhibitors. Once a component containing a selectable element is complexed to the target sequence, the RBP-RNA complex is exposed to a reagent capable of binding the selectable element and the RBP-RNA complex is removed from the mixed population. In another embodiment, glutathione-S-transferase/protease fusion proteins can be adsorbed onto glutathione sepharose beads. Each of these methods represents a separate embodiment of the present invention.

The term “biotin” herein includes, in one embodiment, any of the biotin derivatives that are described in the art. See, for example, U.S. Pat. No. 6,613,516 and references cited therein.

In another embodiment, the molecule that specifically interacts with a component of the RBP-RNA complex of interest is bound by a secondary binding molecule. The secondary binding molecule may bind the molecule that specifically interacts with a component of the RBP-RNA complex of interest directly or indirectly. Examples of direct-binding secondary molecules comprise antibodies. For example, an RBP-RNA complex of interest is bound by a primary antibody, and an antibody recognizing the immunoglobulin chain of the primary antibody is then used to select the bound complex. Examples of indirect-binding secondary molecules comprise antibodies or binding partners for a tag on the molecule that specifically interacts with a component of the RBP-RNA complex of interest. For example, an RBP-RNA complex of interest is bound by a primary antibody that contains an epitope tag, and an antibody recognizing the epitope tag of the primary antibody is then used to select the bound complex. Accordingly, in the aforementioned embodiments, the RBP-RNA complex is attached to the solid support via the ligand and binding molecule.

Alternatively, the molecule that specifically interacts with a component of the RBP-RNA complex of interest can be modified with a selectable element, the properties of which may then be exploited in order to remove the RBP-RNA complex from the mixed population. Non-limiting examples of selectable elements include: nucleic acid sequences, ligands, receptors, antibodies, hapten groups, antigens, biotin, streptavidin, enzymes and enzyme inhibitors. Once a molecule that specifically interacts with a component of the RBP-RNA complex of interest containing a selectable element is complexed to the RBP-RNA complex, the biological sample is exposed to a reagent capable of binding the selectable element and the RBP-RNA complex is removed from the mixed population.

The RBP-RNA complex of interest is selected in step (c), in one embodiment, by removing it from the solid support (i.e., the complex is washed off the solid support using suitable conditions and solvents). In another embodiment, the RBP-RNA complex of interest may remain on the solid support.

A variety of epitopes may be used to tag a protein, while retaining at least part of the biological activity of the unmodified protein. Such epitopes may be naturally occurring amino acid sequences, artificially constructed sequences, or modified natural sequences. Recently, a variety of artificial epitope sequences have been described that have been shown to be useful for tagging and detecting recombinant proteins. In one embodiment, an artificial epitope sequence with the eight amino acid FLAG marker peptide (Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys) (SEQ ID No 470), has been useful for detection as well as affinity purification of recombinant proteins, with antibodies recognizing the epitope readily available (Brewer et al Bioprocess Technol. 2:239-266; Kunz et al J. Biol. Chem. 267:9101-9106).

Additional artificial epitope tags include an improved FLAG tag having the sequence Asp-Tyr-Lys-Asp-Glu-Asp-Asp-Lys (SEQ ID No 471), a nine amino acid peptide sequence Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly (SEQ ID No 472) referred to as the “Strep tag” (Schmidt et al, J. Chromatography 676:337-345), poly-histidine sequences, e.g., a poly-His of six residues which is sufficient for binding to IMAC beads, an eleven amino acid sequence from human c-myc recognized by monoclonal antibody 9E10, or an epitope represented by the sequence Tyr-Pro-Tyr-Asp-Val-Pro-Asp-Tyr-Ala-Ile-Glu-Gly-Arg (SEQ ID No 473) derived from an influenza virus hemagglutinin (HA) subtype, recognized by the monoclonal antibody 12CA5. Also, the Glu-Glu-Phe sequence recognized by the anti-alpha-tubulin monoclonal antibody YL1/2 has been used as an affinity tag for purification of recombinant proteins (Stammers et al., FEBS Lett. 283:298-302).

Another commonly used artificial epitope is a poly-His sequence having six histidine residues (His-His-His-His-His-His) (SEQ ID No 474). Naturally occurring epitopes include the eleven amino acid sequence from human c-myc recognized by the monoclonal antibody 9E10 (Glu-Gln-Lys-Leu-Leu-Ser-Glu-Glu-Asp-Leu-Asn) (SEQ ID No 475) (Manstein et al. (1995) Gene 162:129-134). Another useful epitope is the tripeptide Glu-Glu-Phe (SEQ ID No 476) which is recognized by the monoclonal antibody YL 1/2 against alpha-tubulin. This tripeptide has been used as an affinity tag for the purification of recombinant proteins.
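
For convenience in comparing the epitope tags recited above, the following minimal sketch converts their three-letter residue notation to one-letter code and reports each length. The mapping is the standard amino acid code. The function name and the dictionary of tags are hypothetical, and the sequences are copied exactly as written in this description.

```python
# Sketch: convert hyphenated three-letter peptide notation to one-letter code.
# Tag sequences are transcribed from the text above; names are illustrative.

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide):
    """Convert e.g. 'Asp-Tyr-Lys' to 'DYK'; raises KeyError on unknown residues."""
    return "".join(THREE_TO_ONE[res] for res in peptide.split("-"))


EPITOPE_TAGS = {
    "FLAG": "Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys",
    "improved FLAG": "Asp-Tyr-Lys-Asp-Glu-Asp-Asp-Lys",
    "Strep tag": "Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly",
    "HA": "Tyr-Pro-Tyr-Asp-Val-Pro-Asp-Tyr-Ala-Ile-Glu-Gly-Arg",
    "poly-His": "His-His-His-His-His-His",
    "c-myc": "Glu-Gln-Lys-Leu-Leu-Ser-Glu-Glu-Asp-Leu-Asn",
    "YL1/2 epitope": "Glu-Glu-Phe",
}

if __name__ == "__main__":
    for name, seq in EPITOPE_TAGS.items():
        one = to_one_letter(seq)
        print(f"{name}: {one} ({len(one)} residues)")
```

The reported lengths can be checked against the residue counts stated in the text, for example eight residues for the FLAG peptide and nine for the Strep tag.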

In one embodiment, selecting the RBP-RNA complex in step (c) is performed in the presence of an agent capable of disrupting non-covalent interactions. In one embodiment, the agent is a detergent. In one embodiment, the detergent is ionic. In another embodiment, the detergent is non-ionic. In certain embodiments, the detergent is selected from sodium dodecyl sulfate (SDS) and sodium deoxycholate. In certain other embodiments, the detergent is NP-40, Tergitol, Tween 20, saponin, or Triton X-100. Disruption of non-covalent interactions may also be achieved using aprotic solvents such as dimethyl sulfoxide and hexamethylphosphoramide. The use of detergents and other agents to disrupt non-covalent interactions is well known in the art (see, for example, Molecular Cloning, (2001), Sambrook and Russell, eds.; Methods in Enzymology: Guide to Molecular Cloning Techniques (1987) Berger and Kimmel, eds.; and Current Protocols in Molecular Biology, (1998) Ausubel, et al, eds.). Each of the various agents that disrupts non-covalent interactions that are described in the art represents a separate embodiment of this invention.

“Intermolecular” and “non-covalent” are, in one embodiment, interchangeable terms that refer to bonds between biological macromolecules. In one embodiment, “intermolecular” refers to bonds between components of a complex such as an RBP-RNA complex. In one embodiment, the agent utilized in step (d) disrupts only certain types of intermolecular bonds. Each possibility represents a separate embodiment of the present invention.

In one embodiment, step (c) is performed with a buffer. The use of buffers is well known to those skilled in the art. Typical buffers can be purchased from most biochemical catalogues and include the classical buffers such as Tris, pyrophosphate, monophosphate and diphosphate. A number of references (Current Protocols in Molecular Biology, (1998) Ausubel, et al, eds.; Good, N. E., et al., Biochemistry, 5, 467; Good, N. E. and Izawa, S., Meth. Enzymol., 24:3; and Ferguson, W. J. and Good, N. E., Anal. Biochem. 104:300) describe the use of pH buffers such as Mes, Hepes, Mops, tricine and Ches. Buffers described in the literature that may be useful may be referred to as SDS buffer, lysis buffer, RIPA buffer, RIP buffer, IP buffer, washing buffer, binding buffer, storage buffer, and blocking buffer. A buffer may contain, for example, detergents, pH buffering agents, salts, blocking agents, ion chelators, preservatives, such as, for example, sodium azide, dyes, glycerol, protease inhibitors, and phosphatase inhibitors, among other substances. In one embodiment, step (c) comprises the use of a high-salt buffer. In one embodiment, a high-salt buffer has an osmolarity greater than physiologic osmolarity. In another embodiment, a high salt buffer has a salt content greater than physiological salt content. Methods for calculating the osmolarity and salt content of a buffer are well known to those skilled in the art. Each buffer disclosed in the art represents a separate embodiment of this invention.
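
One simple way of estimating whether a buffer exceeds physiologic osmolarity may be sketched as follows. The calculation assumes ideal, complete dissociation of each solute (no osmotic coefficients), and takes approximately 300 mOsm/L as the physiologic reference. Both are simplifying assumptions for this example, and the buffer recipe shown is hypothetical rather than taken from the Examples.

```python
# Sketch: ideal osmolarity estimate for a buffer recipe.
# Assumptions: complete dissociation, no osmotic coefficients, and a
# physiologic reference of about 300 mOsm/L. The recipe is hypothetical.

PHYSIOLOGIC_MOSM = 300.0  # assumed approximate reference value


def ideal_osmolarity_mosm(solutes):
    """solutes maps name -> (concentration in mM, particles per formula unit)."""
    return sum(conc_mm * particles for conc_mm, particles in solutes.values())


def is_high_salt(solutes, reference=PHYSIOLOGIC_MOSM):
    """True when the ideal osmolarity exceeds the reference value."""
    return ideal_osmolarity_mosm(solutes) > reference


if __name__ == "__main__":
    # Hypothetical wash buffer: 1 M NaCl (2 particles), 50 mM Tris-HCl (2 particles).
    wash = {"NaCl": (1000.0, 2), "Tris-HCl": (50.0, 2)}
    print(ideal_osmolarity_mosm(wash), "mOsm/L; high salt:", is_high_salt(wash))
```

In practice, measured osmolality differs from this ideal estimate, particularly at high ionic strength, so the sketch serves only as a first approximation.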

In one embodiment, the RBP-RNA complex of interest is first washed with a buffer containing an agent that disrupts non-covalent interactions, and is subsequently washed with a high-salt buffer. In another embodiment, the RBP-RNA complex of interest is first washed with a high-salt buffer, and is subsequently washed with a buffer containing an agent that disrupts non-covalent interactions. In one embodiment, the high salt buffer also contains an agent that disrupts non-covalent interactions. In another embodiment, the high salt buffer does not contain an agent that disrupts non-covalent interactions. The high salt buffer may also contain any of the other buffer components listed hereinabove. Each of these techniques represents a separate embodiment of the present invention.

In another embodiment, RNA in the RBP-RNA complexes selected in step (c) is labeled. In one embodiment, the labeling comprises the use of gamma-³²P ATP and T4 polynucleotide kinase (PNK). T4 PNK leaves a 5′ phosphate on RNA molecules. Also, T4 PNK has a “resolving” activity that opens up a 2′, 3′ cyclic phosphate at the 3′ end of the RNA fragments (for example, as remains after T1 RNAse digestion) to yield a fraction of molecules with free 3′-OH (FIG. 1A, step 3) (Walker et al, PNAS 72:122-6). In one embodiment, fragments treated with T4 PNK are suitable for directional ligation of nucleotide linkers onto the fragments. For example, a linker containing a 5′-OH and a 3′-OH group may only be able to be coupled to the 5′ phosphorylated end of the RNA fragment. By contrast, a nucleotide linker containing a 5′ phosphate and a 3′ end blocked with puromycin may only be able to be linked to the 3′ end of the RNA fragment (FIG. 1A, step 7). In another embodiment, fragments treated with T4 PNK are suitable for directional subcloning of the fragments into a vector.
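
The directionality described above follows from the end chemistry required for ligation, namely a 3′-OH on the upstream molecule and a 5′ phosphate on the downstream molecule. The following minimal sketch models only that rule. The class and function names are hypothetical, and the model says nothing about ligation efficiency or reaction conditions.

```python
# Sketch: end-chemistry rule behind directional linker ligation.
# A junction can form only between an upstream 3'-OH and a downstream 5'-P.
# Class and function names are illustrative assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    name: str
    five_prime: str   # "P" or "OH"
    three_prime: str  # "OH" or "puromycin" (blocked)


def can_join(upstream, downstream):
    """True if upstream's 3' end can be ligated to downstream's 5' end."""
    return upstream.three_prime == "OH" and downstream.five_prime == "P"


# RNA fragment after T4 PNK treatment: 5' phosphate, free 3'-OH.
rna = Strand("RNA fragment", "P", "OH")
# Linker with 5'-OH and 3'-OH: can only sit upstream of the RNA 5' end.
linker_5 = Strand("5' linker", "OH", "OH")
# Linker with 5' phosphate and puromycin-blocked 3' end: only downstream.
linker_3 = Strand("3' linker", "P", "puromycin")

if __name__ == "__main__":
    for up, down in [(linker_5, rna), (rna, linker_5),
                     (rna, linker_3), (linker_3, rna),
                     (linker_5, linker_5), (linker_3, linker_3)]:
        print(f"{up.name} -> {down.name}: {can_join(up, down)}")
```

Under this rule neither linker can ligate to itself, and each linker can join only one end of the RNA fragment, which is what makes the ligation directional.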

In one embodiment, the sample is labeled with gamma-³²P ATP. Other detectable signal moieties suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., 3H, 125I, 35S, 14C, or 32P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads, and electron-dense labels such as gold, silver, lead and other metals. Methods employing the use of such labels are described in, for example, U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241 and references cited therein, or others well known to one skilled in the art. The use of each of these methods represents a separate embodiment of the present invention.

In one embodiment of the present invention, a detectable signal moiety is then reacted with the modified or unmodified 5′ end of the fragments to produce labeled fragments. For example, a biotin group such as PEO-Iodoacetyl Biotin may be conjugated to 5′-ends of the fragments which have been modified by T4 polynucleotide kinase and gamma-S-ATP. In one such embodiment, the label is supplied to the nucleic acid by the addition of oxide biotinyl-iodacetamidyl-3,6-dioxaoctanediamine (Iodoacetyl Biotin), for example, by the addition of polyethylene oxide biotinyl-iodacetamidyl-3,6-dioxaoctanediamine (PEO-Iodoacetyl Biotin). PEO-Iodoacetyl Biotin (Pierce Chemical Co. Product #213341ZZ) is a long-chain, water-soluble, sulfhydryl (—SH)-reactive biotinylation reagent. The PEO spacer arm imparts high water solubility. Iodoacetyl Biotin (Pierce Chemical Co. Product #21333ZZ) is generally dissolved in DMSO or DMF before use. The iodoacetyl functional group reacts predominantly with free —SH groups. The reaction occurs by nucleophilic substitution of iodine with a thiol group, resulting in a stable thio-ether bond. The use of PEO-Iodoacetyl Biotin as a biotinylation reagent for proteins and antibodies has been described previously. See, for example, Instructions for EZ-Link™ PEO-Iodoacetyl Biotin, Pierce Chemical Co. PEO-Iodoacetyl Biotin is also a suitable label for nucleic acids. The use of Iodoacetyl Biotin as a biotinylation reagent for antibodies is described in, for example, U.S. Pat. No. 5,137,804. The use of Iodoacetyl Biotin as a label for the enzyme kinase is described in, for example, (Jeong et al. “Kinase Assay Based on Thiophosphorylation and Biotinylation,” Biotechniques 27:1232-1238 (December 1999)). We have also found that PEO-Iodoacetyl Biotin can be conjugated to a nucleic acid fragment without 5′ modification. The use of each of these methods represents a separate embodiment of the present invention.

Means of detecting such labels are well known to those of skill in the art. In one embodiment, radiolabels are detected using photographic film or scintillation counters, and fluorescent markers may be detected using a photodetector to detect emitted light. Enzymatic labels are, in one embodiment, detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label. Colloidal gold and other electron-dense labels can be detected by measuring scattered light. Each of these techniques represents a separate embodiment of the present invention.

Methods for labeling the 5′ end of an oligonucleotide, which may be used in other embodiments, include, but are not limited to, the following: (i) periodate oxidation of a 5′-to-5′-coupled ribonucleotide, followed by reaction with an amine-reactive label (Heller & Morrison (1985) in Rapid Detection and Identification of Infectious Agents, D. T. Kingsbury and S. Falkow, eds., pp 245-256, Academic Press); (ii) condensation of ethylenediamine with 5′-phosphorylated polynucleotide, followed by reaction with an amine reactive label (Morrison, European Patent Application 232 967 and references cited therein); and (iii) introduction of an aliphatic amine substituent using an aminohexyl phosphite reagent in solid-phase DNA synthesis, followed by reaction with an amine reactive label (Cardullo et al. (1988) Proc. Natl. Acad. Sci. USA, 85: 8790-8794).

In step (d) of the present invention, the RBP-RNA complex of interest is purified under stringent conditions (FIG. 1A, steps 4-5). In one embodiment, as depicted in FIG. 1A, the stringent conditions consist of SDS-PAGE, transfer to a nitrocellulose (NC) filter, and digestion with Proteinase K. SDS-PAGE separates covalently linked RBP-RNA complexes from free RNA and RBP-RNA complexes that are not covalently linked. The radioactive label of this embodiment allows visualization of these different populations of RNA, and of a band containing the RBP-RNA complexes of interest (FIG. 1B-C). Transfer of the material in the gel to the NC filter constitutes a further purification step, as NC binds protein-bound but not free RNA. Digestion of the protein with Proteinase K results in extraction from the NC filter of RNA molecules of the RBP-RNA complexes.

In one embodiment, “purification” refers to removal of the RNA molecule of interest from the RBP-RNA complex. In one embodiment, this removal comprises digestion of one or more other components of the RBP-RNA complex. In another embodiment, “purification” refers to purification of the RNA molecule of interest together with one or more components of the RBP-RNA complex.

In one embodiment, the purification of step (d) is performed in the presence of an agent capable of disrupting non-covalent interactions as disclosed herein. In another embodiment, the RBP-RNA complex of interest is heated in the presence of a buffer as part of step (d). In one embodiment, the biological sample is heated to a temperature of about 100° Celsius (C). In one embodiment, the biological sample is heated to a temperature greater than about 25° C. In one embodiment, the biological sample is heated to a temperature greater than about 4° C. Each of these techniques represents a separate embodiment of the present invention.

In one embodiment, the RBP-RNA complex is purified by a chromatographic method. In another embodiment, the purification may utilize other methods known in the art.

Other embodiments of techniques which can be applied or combined to purify the RBP-RNA complex of interest comprise chemical extraction, such as phenol or chloroform extraction, dialysis, precipitation such as ammonium sulfate cuts, electrophoresis, and chromatographic techniques. In another embodiment, chemical isolation techniques are used for removal of bulk quantities of non-proteinaceous material, and may therefore be used for purifying an RBP-RNA complex of interest. Electrophoretic separation involves placing the biological sample into wells of a gel. In one embodiment, the gel is a denaturing gel. In another embodiment, the gel is a non-denaturing gel. In another embodiment, the gel is a polyacrylamide gel. In another embodiment, the gel is an agarose gel. Direct or pulsed current is applied to the gel and the various components of the system separate according to molecular size, configuration, charge or a combination of their physical properties. Methods for the purification of protein from acrylamide and agarose gels are known and commercially available. Each of these techniques represents a separate embodiment of the current invention.

In one embodiment, the chromatographic method is performed in the presence of an agent capable of disrupting non-covalent interactions. In one embodiment, the agent is a detergent. In one embodiment, the detergent is ionic. In yet another embodiment, the detergent is non-ionic. In certain embodiments, the detergent is selected from sodium dodecyl sulfate (SDS) and sodium deoxycholate. In certain other embodiments, the non-ionic detergent is NP-40, Tergitol, Tween 20, or Triton X-100. Disruption of non-covalent interactions may also be achieved using aprotic solvents such as dimethyl sulfoxide and hexamethylphosphoramide. The use of detergents and other agents to disrupt non-covalent interactions is well known in the art (see, for example, Molecular Cloning, (2001), Sambrook and Russell, eds.; and Current Protocols in Molecular Biology, (1998) Ausubel, et al, eds.). Each of the agents that disrupts non-covalent interactions represents a separate embodiment of this invention.

In one embodiment, the chromatographic method is gel filtration. In another embodiment, the chromatographic method is fast protein liquid chromatography (FPLC). In another embodiment, the chromatographic method is high-pressure liquid chromatography. In another embodiment, the chromatographic method is reverse-phase chromatography. In another embodiment, the chromatographic method is affinity chromatography. In another embodiment, the chromatographic method is ion exchange chromatography.

The chromatographic method utilizes, in one embodiment, a gel. In another embodiment, the chromatographic method utilizes a column. In another embodiment, the chromatographic method utilizes a liquid phase apparatus. In another embodiment, the chromatographic method utilizes a thin-layer apparatus. In another embodiment, the chromatographic method utilizes any other chromatographic technique known in the art. Each type of chromatographic method represents a separate embodiment of the current invention.

In one embodiment, the gel utilized in the chromatographic method may comprise acrylamide. In another embodiment, the gel utilized may comprise agarose. In another embodiment, the gel utilized may comprise any other matrix constituent known in the art. The use of such constituents is well known to those skilled in the art. Each of these techniques represents a separate embodiment of the current invention.

In another embodiment, the chromatographic method is performed in the presence of a pH buffering agent. In another embodiment, the pH buffering agent prevents or reduces alkalinization. In one embodiment, purifying the RBP-RNA complex of interest in step (d) is performed in the presence of a reducing agent. In another embodiment, step (d) is performed in the absence of reducing agent. Each of these techniques represents a separate embodiment of the current invention.

In one embodiment of the current invention, purifying the RBP-RNA complex of interest in step (d) comprises transferring the RBP-RNA complex of interest to a substrate. In one embodiment, the substrate is a membrane. In one embodiment, the substrate is composed of, for example, NC (Example 1), nylon (Stahl et al., Appl. Environ. Microbiol., 54:1079-1084), a siliceous or silane-containing substrate such as, for example, glass, porous silica, or oxidized silicon materials (U.S. Pat. No. 6,426,183 and references cited therein), silica gel, glass fibers, quartz fibers, and zeolites (U.S. Pat. No. 6,383,393 and references cited therein), or any other substrate known in the art (as described, for example, in Molecular Cloning, (2001), Sambrook and Russell, eds.; Methods in Enzymology: Guide to Molecular Cloning Techniques (1987) Berger and Kimmel, eds.; or in Current Protocols in Molecular Biology (1998) Ausubel, et al, eds.). Each of these techniques represents a separate embodiment of the current invention.

In one embodiment, the substrate may preferentially bind RNA covalently bound to protein over RNA not covalently bound to protein. In another embodiment, the substrate may exclusively bind RNA covalently bound to protein over RNA not covalently bound to protein.

In one embodiment, the RBP-RNA complex of interest is transferred to the substrate by electrophoresis. The use of electrophoresis is well known to those skilled in the art. In some embodiments, the RBP-RNA complex of interest is transferred to the substrate using, for example, electro-blotting, capillary electrophoresis, positive pressure blotting, vacuum blotting, direct blotting, mechanical blotting, or, for example, any of the methods described in U.S. Pat. No. 6,602,391 and references cited therein. Each of these techniques represents a separate embodiment of the current invention. In another embodiment, the RBP-RNA complex of interest is transferred to the substrate using the methods described in U.S. Pat. No. 6,383,393 and references cited therein. Each of these methods represents a separate embodiment of the present invention.

In one embodiment, the transfer of the RBP-RNA complex of interest to a substrate may involve a semipermeable membrane. In another embodiment, the transfer may involve a micro-porous composite or a micro-porous membrane. In one embodiment, the transfer apparatus may be vertical. In another embodiment, the transfer apparatus may be horizontal. Each of these methods represents a separate embodiment of the present invention.

In one embodiment, purifying the RBP-RNA complex of interest in step (d) comprises eluting the RBP-RNA complex from the substrate. In one embodiment, the elution comprises physically removing the section of the substrate containing the RBP-RNA complex of interest. In one embodiment, the elution comprises, for example, digestion with Proteinase K or a homologous enzyme. Proteinase K is capable of efficiently digesting protein in an RBP-RNA complex, liberating RNA in the complex from a substrate and yielding products that can be used for ligation and amplification (FIG. 2).

In another embodiment, the RBP-RNA complex of interest is eluted from the substrate by digestion with a member of one of the following classes of proteases or their homologues: Aspartyl proteases, caspases, thiol proteases, Insulinase family proteases, zinc binding proteases, Cytosol Aminopeptidase family proteases, Zinc carboxypeptidases, Neutral Zinc Metallopeptidases, extracellular matrix metalloproteinases, matrixins, Prolyl oligopeptidases, Aminopeptidases, Proline Dipeptidases, Methionine aminopeptidases, Serine Carboxypeptidases, Cathepsins, Subtilases, Proteasome A-type Proteases, Proteasome B-type Proteases, Trypsin Family Serine Proteases, Subtilase Family Serine Proteases, Peptidases, Ubiquitin carboxyl-terminal hydrolases, or other proteases described in U.S. Pat. No. 6,395,889 and references cited therein. In another embodiment, the elution comprises the methods described in U.S. Pat. No. 6,383,393 and references cited therein. A number of these proteases are commercially available. The use of these proteases is known to those skilled in the art, and is described in, for example, Lundell et al (Anal Biochem 266: 31-47) and product literature from Roche and Sigma-Aldrich. Each of these techniques represents a separate embodiment of the current invention.

In one embodiment, RNA molecules from RNA-protein complexes of interest are analyzed. In one embodiment, the analysis may comprise electrophoresis. In one embodiment, the electrophoresis may be carried out with SDS-PAGE (FIG. 1A, step 6 and FIG. 2). SDS-PAGE electrophoresis reveals the approximate size of nucleic acid molecules. This technique is well known to those skilled in the art, and is described in, for example, Molecular Cloning, (2001), Sambrook and Russell, eds. In another embodiment, RNA from the RBP-RNA complex of interest is size purified. The analysis or size purification may utilize any chromatography method described herein, or any other method described in the art. Each such method represents a separate embodiment of the current invention.

Size purification is, in one embodiment, an effective method of following the various steps involved in amplifying and identifying RNA molecules of interest. In another embodiment, size purification may increase the purity of products obtained (FIG. 1A, step 6 and FIG. 2).

In one embodiment (depicted in FIG. 1A), RNA molecules from the RBP-RNA complexes of interest are then identified. Identification is accomplished, according to this aspect, by linker ligation, reverse transcriptase-polymerase chain reaction (RT-PCR), which amplifies the RNA molecules, ligation into a vector, and sequencing (step 7). In another embodiment, the RBP-RNA complex of interest is not identified. Analysis of RNA molecules (step 6) may also be useful at one or more stages in the steps comprising the identification process, as described herein. In one embodiment, this method is used to identify RNA molecules that affect gene silencing or post-transcriptional regulation of gene expression.

In one embodiment of the current invention, nucleotide linkers are ligated to an RNA molecule in the RBP-RNA complex of interest. “Ligation”, in all the applications described herein, refers, in one embodiment, to attaching an end of a nucleotide molecule to another end of a nucleotide molecule. The two ends that are joined may be from separate molecules or the same molecule. Ligation of nucleotide linkers facilitates subsequent amplification (FIG. 1A, step 7). In one embodiment, the ligation is performed with T4 RNA ligase or a homologous enzyme. Alternately, the ligation can be performed by any RNA ligase known in the art. Methods of ligation and the use of various ligases and methods for using them are described in, for example, Molecular Cloning, (2001), Sambrook and Russell, eds. or Methods in Enzymology: Guide to Molecular Cloning Techniques (1987) Berger and Kimmel, eds. Each such ligase represents an additional embodiment of the current invention.

In another embodiment, there are provided nucleotide linkers comprising a nucleic acid having a sequence selected from the group consisting of: 5′ P-CGACCUGCAGGCUUCCUGC-puromycin (SEQ ID No 487); 5′ OH-CUUAGGUGGAAGGGCAAGCG-OH 3′ (SEQ ID No 488); 5′ P-GGG CAACAGGUACCAAACUC-puromycin (SEQ ID No 489); 5′ OH-CUUAGGUGGUACCGCAAGCG-OH 3′ (SEQ ID No 490); 5′ P-PGGGCAACAGUAGAUAAACUC-puromycin (SEQ ID No 491); 5′-OH TCGGGCGAGTCGTCTG-OH 3′ (SEQ ID No 483); 5′-P CCGCATCGTCCTCCC puromycin (SEQ ID No 484); 5′-TCGGGCGAGTCGTCTG (SEQ ID No 485); and GGGAGGACGATGCGG (SEQ ID No 486); 3′ link RNA (5′-P CAG ACG ACG AGC GGG A 3′-puromycin) (SEQ ID No 478); GL5 DNA (AGG GAG GAC GAT GCG G) (SEQ ID No 479); GL3 DNA (TCC CGC TCG TCG TCT G) (SEQ ID No 480); 5CLIPcloneNotI (CAGTGCTGCGCGGCCGCAGGGAGGACGATGCGG) (SEQ ID No 481); 3CLIPcloneAscI (TCAAGTCAGGGCGCGCCTCCCGCTCGTCGTCTG) (SEQ ID No 482); and a sequence as set forth in SEQ ID No 477-486 and 497-502. In one embodiment, the linkers may be useful in methods comprising ligation to a nucleotide molecule. In one embodiment (depicted in FIGS. 1 and 2), the linkers may be directionally ligated to the ends of RNA fragments to facilitate cDNA synthesis and RT-PCR. In another embodiment, the linkers may be used in a non-directional manner. In another embodiment, the linkers may be used to facilitate a later step other than cDNA synthesis or RT-PCR.
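By way of a non-limiting illustration, the relationship between several of the linkers and primers recited above may be checked computationally. The following minimal sketch uses only sequences recited above (with spaces removed) and confirms that GL3 DNA is the reverse complement of the 3′ link RNA, and that the two cloning primers end in GL5 DNA and GL3 DNA, respectively. The function names are hypothetical, and the NotI (GCGGCCGC) and AscI (GGCGCGCC) recognition sequences are standard restriction sites that are assumed here rather than recited in the text.

```python
# Sequences as recited in the text, spaces removed. Helper names are hypothetical.
LINK3_RNA = "CAGACGACGAGCGGGA"                      # 3' link RNA (SEQ ID No 478)
GL5_DNA = "AGGGAGGACGATGCGG"                        # GL5 DNA (SEQ ID No 479)
GL3_DNA = "TCCCGCTCGTCGTCTG"                        # GL3 DNA (SEQ ID No 480)
CLIP5_NOTI = "CAGTGCTGCGCGGCCGCAGGGAGGACGATGCGG"    # 5CLIPcloneNotI (SEQ ID No 481)
CLIP3_ASCI = "TCAAGTCAGGGCGCGCCTCCCGCTCGTCGTCTG"    # 3CLIPcloneAscI (SEQ ID No 482)

# Assumption: standard recognition sites, not recited in the text.
NOTI_SITE = "GCGGCCGC"
ASCI_SITE = "GGCGCGCC"


def reverse_complement_dna(seq):
    """Return the reverse complement as DNA; accepts RNA (U) or DNA (T) input."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def check_linker_relationships():
    """Report how the recited primers relate to the recited linkers."""
    return {
        "gl3_is_revcomp_of_3prime_linker": reverse_complement_dna(LINK3_RNA) == GL3_DNA,
        "noti_primer_ends_with_gl5": CLIP5_NOTI.endswith(GL5_DNA),
        "asci_primer_ends_with_gl3": CLIP3_ASCI.endswith(GL3_DNA),
        "noti_site_present": NOTI_SITE in CLIP5_NOTI,
        "asci_site_present": ASCI_SITE in CLIP3_ASCI,
    }


if __name__ == "__main__":
    for name, ok in check_linker_relationships().items():
        print(name, ok)
```

A primer that is the reverse complement of the 3′ linker can anneal to the ligated RNA and prime cDNA synthesis, which is consistent with the directional use of the linkers described above.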

In one embodiment of the current invention, the nucleotide linkers are directionally oriented. In another embodiment of the current invention, the nucleotide linkers are not directionally oriented.

The term “nucleotide linker”, in one embodiment, refers to an oligonucleotide ligated onto the end of an RNA fragment. In another embodiment, “nucleotide linker” refers to an oligonucleotide used as a primer in a polymerase chain reaction (PCR). In one embodiment, the nucleotide linkers are suitable for directional ligation. In one embodiment, the nucleotide linkers are suitable for PCR amplification. In another embodiment, the nucleotide linkers are suitable for subcloning.

In one embodiment, the nucleotide linkers comprise RNA or a derivative thereof. In one embodiment, the nucleotide linkers comprise DNA or a derivative thereof. In one embodiment, the nucleotide linkers comprise any other nucleic acid or nucleic acid derivative disclosed herein.

In another embodiment of the current invention, there is provided a method of using linkers comprising a sequence as set forth in SEQ ID No 477-502 to amplify a nucleic acid molecule. The method may comprise, in one embodiment, the following steps: (a) ligating the nucleic acid molecule to one or more of the linkers; (b) synthesizing a nucleic acid strand complementary to a product of step (a) using one or more of the linkers as a primer; (c) amplifying a product of step (b) by PCR, using one or more of the linkers as a primer. In another embodiment, an amplification step other than PCR may be used. In another embodiment, the method may further comprise size separation, phenol/chloroform purification, or ethanol precipitation. Methods for these techniques are known in the art and are described, for example, in Example 1 and in Molecular Cloning, (2001), Sambrook and Russell, eds. Each such method represents a separate embodiment of the present invention.
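Steps (a) through (c) above may be sketched in silico as follows. This is a simplified, non-limiting model that treats sequences as strings and ignores reaction chemistry. The 3′ linker, GL5 DNA, and GL3 DNA sequences are taken from the text; the 5′ RNA linker (GL5 DNA with U in place of T) and the RNA fragment are hypothetical and are assumed only for illustration, as are all function names.

```python
def revcomp(seq):
    """Reverse complement, returned as DNA; accepts RNA (U) or DNA (T)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}
    return "".join(comp[base] for base in reversed(seq))


def ligate(linker5, fragment, linker3):
    """Step (a): join the 5' linker, the RNA fragment, and the 3' linker."""
    return linker5 + fragment + linker3


def reverse_transcribe(rna, rt_primer):
    """Step (b): return first-strand cDNA if the primer anneals to the RNA 3' end."""
    annealing_site = revcomp(rt_primer).replace("T", "U")
    if not rna.endswith(annealing_site):
        raise ValueError("RT primer does not anneal to the 3' end of the RNA")
    return revcomp(rna)


def pcr_product(cdna, fwd_primer, rev_primer):
    """Step (c): return the sense-strand amplicon bounded by the two primers."""
    sense = revcomp(cdna)
    start = sense.find(fwd_primer)
    end = sense.rfind(revcomp(rev_primer))
    if start < 0 or end < start:
        raise ValueError("primers do not flank a product on this template")
    return sense[start:end + len(rev_primer)]


LINK3_RNA = "CAGACGACGAGCGGGA"   # 3' link RNA, from the text
GL5_DNA = "AGGGAGGACGATGCGG"     # from the text
GL3_DNA = "TCCCGCTCGTCGTCTG"     # from the text
LINK5_RNA = "AGGGAGGACGAUGCGG"   # ASSUMPTION: hypothetical 5' RNA linker
FRAGMENT = "UCAUUCAUCAU"         # ASSUMPTION: hypothetical RNA fragment

ligated = ligate(LINK5_RNA, FRAGMENT, LINK3_RNA)
cdna = reverse_transcribe(ligated, GL3_DNA)
amplicon = pcr_product(cdna, GL5_DNA, GL3_DNA)
```

In this sketch the amplicon is the DNA copy of the fragment flanked by the two linker sequences, which is the form that would subsequently be subcloned or sequenced.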

In one embodiment of the current invention, an RNA molecule in the RBP-RNA complex of interest is amplified. In one embodiment, the amplification utilizes PCR. “PCR”, in one embodiment, is a process in which nucleic acid is amplified (i.e., the copy number is increased). Briefly, a nucleic acid molecule desired to be amplified is incubated with oligonucleotides known as “primers”, mononucleotides, and a polymerase, and is put through a cycle of temperature changes that facilitate denaturation, annealing of the primers to the molecule desired to be amplified, and extension of the primers to create additional copies of the molecule. Generally, PCR primers will be identical or similar in sequence to opposite strands of the template to be amplified. The 5′ terminal nucleotides of the two primers may coincide with the ends of the amplified material. PCR can be used to amplify specific RNA sequences, specific DNA sequences from total genomic DNA, and cDNA transcribed from total cellular RNA, bacteriophage or plasmid sequences, etc. In some embodiments, the XL-PCR kit (PE Biosystems), nested primers, and commercially available cDNA or genomic DNA libraries may be used to extend the nucleic acid sequence. In some embodiments, primers may be designed using commercially available software, such as OLIGONUCLEOTIDE 4.06 primer analysis. As used herein, PCR is considered to be one, but not the only, example of a nucleic acid polymerase reaction method for amplifying a nucleic acid test sample. Methods for amplification of nucleic acid are known to those skilled in the art (see, for example, Mullis et al., Cold Spring Harbor Symp. Quant. Biol., 51:263; Erlich et al, PCR Technology (Stockton Press, N.Y., 1989); and U.S. Pat. No. 4,683,195 and references cited therein). Each such method represents a separate embodiment of the current invention.
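The increase in copy number described above may be estimated with a simple idealized model, in which each cycle multiplies the number of template copies by (1 + efficiency). The sketch below is illustrative only; the function name and the efficiency parameter are assumptions that do not appear in the text, and real reactions plateau as reagents are consumed.

```python
def copies_after_pcr(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 means perfect doubling); it is a hypothetical parameter.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0 or initial_copies < 0:
        raise ValueError("cycles and initial_copies must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # 100 starting molecules, 10 cycles of perfect doubling.
    print(copies_after_pcr(100, 10))
```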

In one embodiment, “cDNA” refers to a DNA molecule that is complementary to an RNA molecule. In one embodiment, the RNA molecule serves as a substrate for the synthesis of the DNA molecule.

In one embodiment, the present invention provides a method of detecting an RNA motif which represents a binding site on the RNA molecule for the RBP. In one embodiment, the motif may be detected. In one embodiment, the detection comprises partial fragmentation, digestion, hydrolysis, or physical or chemical treatment, as described for step (b) hereinabove. Each of these methods represents a separate embodiment of the invention.

In another embodiment, the present invention provides a method for identifying a candidate RNA motif that may mediate binding to an RNA binding protein of interest, comprising the steps of (a) purifying a plurality of RNA molecules interacting with said RNA binding protein of interest by the CLIP method; (b) obtaining sequences from a subset of said plurality of RNA molecules; and (c) detecting a presence of said candidate RNA motif in two of said sequences, thereby identifying a candidate RNA motif that may mediate binding to an RNA binding protein of interest.
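Step (c) above, detecting a candidate motif present in two of the sequences, may be approximated computationally by counting short subsequences (k-mers) that occur in at least two distinct sequences. The following is a minimal sketch under that assumption; the function name, the k-mer length, and the example sequences are hypothetical and are not taken from the text.

```python
from collections import defaultdict


def candidate_motifs(sequences, k=4, min_sequences=2):
    """Return k-mers found in at least min_sequences distinct sequences.

    The result maps each k-mer to the number of distinct sequences
    in which it occurs (repeats within one sequence count once).
    """
    seen_in = defaultdict(set)
    for index, seq in enumerate(sequences):
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            seen_in[seq[i:i + k]].add(index)
    return {
        kmer: len(indices)
        for kmer, indices in seen_in.items()
        if len(indices) >= min_sequences
    }


if __name__ == "__main__":
    # Hypothetical sequences for illustration only.
    tags = ["GGUCAUAA", "CCUCAUGG", "AAAAGGGG"]
    print(candidate_motifs(tags, k=4))
```

Counting distinct sequences rather than total occurrences reflects the requirement that the motif be present in two of the sequences, rather than repeated within a single sequence.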

In one embodiment, said motif interacts with a Nova-1 protein. In another embodiment, said motif interacts with a Nova-2 protein. In another embodiment, said motif interacts with a HuA. In another embodiment, said motif interacts with a HuB. In another embodiment, said motif interacts with a HuC. In another embodiment, said motif interacts with a HuD. In another embodiment, said motif interacts with a FMRP. In another embodiment, said motif interacts with a FXRP1. In another embodiment, said motif interacts with a FXRP2. In another embodiment, said motif interacts with a combination of any of the above. In another embodiment, said motif interacts with any other RBP known in the art. Each possibility represents a separate embodiment of the present invention.

In one embodiment, the detection comprises determining the sequence of an RNA molecule from the RBP-RNA complex of interest. In one embodiment, the detection comprises analysis or comparison of multiple sequences identified by the present invention (Example 3). Analysis or comparison of multiple sequences may be performed manually, or using one of various methods for determining homology described hereinabove. Each of these methods represents a separate embodiment of the invention.

In one embodiment, the detection comprises amplification, as described herein. In one embodiment, the detection comprises inserting a nucleic acid molecule from the RBP-RNA complex of interest into a vector. In one embodiment, the detection comprises inserting an amplified product of a nucleic acid molecule from the RBP-RNA complex of interest into a vector. The insertion may be performed by any technique known in the art. Such techniques are described, for example, in Molecular Cloning, (2001), Sambrook and Russell, eds. or Methods in Enzymology: Guide to Molecular Cloning Techniques (1987) Berger and Kimmel, eds. Each such technique represents a separate embodiment of the present invention.

In one embodiment, the detection involves RNA footprinting. Methods for RNA footprinting are known to those skilled in the art, and comprise the use of chemical or enzymatic means of nicking RNA. Such methods are described in Curr Opin Struct Biol 12:648-53 and references cited therein, and in Molecular Cloning, (2001), Sambrook and Russell, eds. or Methods in Enzymology: Guide to Molecular Cloning Techniques (1987) Berger and Kimmel, eds. Each of these methods represents a separate embodiment of the invention.

In one embodiment, the component that interacts with the RNA motif may be a protein. In another embodiment, the component that interacts with the RNA motif may be a nucleic acid. In another embodiment, the component that interacts with the RNA motif may be another molecule or compound (e.g., carbohydrate, lipid, vitamin, etc.) that associates with the RBP-RNA complex.

In one embodiment, the word “motif” refers to a portion of an RNA molecule. In another embodiment, the word “motif” refers to an entire RNA molecule. In another embodiment, the word “motif” refers to a portion of an RNA molecule that exhibits a particular structure. In another embodiment, the word “motif” refers to a particular structure. In another embodiment, the word “motif” refers to an element of structure that recurs in more than one context. In another embodiment, the word “motif” refers to a sequence that recurs in more than one context. Each of these represents a separate embodiment of the present invention.

In one embodiment of the present invention, the step of identifying an RNA molecule or motif in the RBP-RNA complex of interest comprises the use of hybridization to nucleic acid arrays. Those of skill in the art will appreciate that an enormous number of array designs are suitable for the practice of this invention. High-density arrays may be used for a variety of applications, including, for example, gene expression analysis, genotyping, variant detection, and analysis of alternate splicing patterns.

Any of various techniques for large-scale polymer synthesis and probe array manufacturing known to one skilled in the art may be utilized, such as, for example, U.S. Pat. Nos. 5,143,854, 5,242,979, 5,252,743, 5,324,663, 5,384,261, 5,405,783, 5,412,087, 5,424,186, 5,445,934, 5,451,683, 5,482,867, 5,489,678, 5,491,074, 5,510,270, 5,527,681, 5,550,215, 5,571,639, 5,593,839, 5,599,695, 5,624,711, 5,631,734, 5,677,195, 5,744,101, 5,744,305, 5,753,788, 5,770,456, 5,831,070, 6,040,193 and 5,856,011. Each of these techniques represents a separate embodiment of the present invention.

In another embodiment, the present invention provides a method of assessing a level of association of an RNA transcript of interest with an RBP of interest, comprising the steps of: (a) contacting an RBP-RNA complex containing said RBP of interest with an agent that creates a covalent bond between two components of said RBP-RNA complex; (b) cleaving an RNA molecule of said RBP-RNA complex with an agent capable of cleaving a bond of said RNA molecule, thereby generating a fragment of said RNA molecule, wherein said fragment is at least 22 nucleotide bases in length; (c) selecting said RBP-RNA complex with a molecule that specifically interacts with a component thereof; (d) purifying said RBP-RNA complex, wherein said purifying comprises a chromatographic method; and (e) assessing a presence or amount of said RNA transcript of interest or a fragment thereof in said plurality of RNA molecules, thereby assessing a level of association of an RNA transcript of interest with an RBP of interest.

In another embodiment, the present invention provides a method of screening a test compound for its ability to modulate a level of association between an RBP and an RNA transcript, comprising the steps of: (a) assessing a first level of association between said RBP and said RNA transcript in a first cell by the method described in the previous paragraph, wherein said first cell has been contacted with said test compound; (b) assessing a second level of association between said RBP and said RNA transcript in a second cell by the method described in the previous paragraph, wherein said second cell has not been contacted with said test compound; and (c) comparing said first level of association with said second level of association, wherein a difference between said first level of association and said second level of association indicates an ability of said test compound to modulate a level of association between said RBP and said RNA transcript.
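The comparison in step (c) above may be illustrated by a simple fold-change calculation between the level of association measured in the treated cell and that measured in the untreated cell. The sketch below is a non-limiting example; the function name and the two-fold cutoff are assumptions chosen for illustration, and the text itself requires only that a difference between the two levels be observed.

```python
def modulates_association(treated_level, untreated_level, min_fold_change=2.0):
    """Compare association levels from treated and untreated cells.

    Returns (is_modulated, fold_change). min_fold_change is a hypothetical
    cutoff: a change of at least this factor in either direction is flagged.
    """
    if treated_level < 0 or untreated_level <= 0:
        raise ValueError("levels must be non-negative and the untreated level positive")
    if min_fold_change < 1.0:
        raise ValueError("min_fold_change must be at least 1")
    fold_change = treated_level / untreated_level
    is_modulated = (
        fold_change >= min_fold_change or fold_change <= 1.0 / min_fold_change
    )
    return is_modulated, fold_change


if __name__ == "__main__":
    # Hypothetical measurements, e.g. normalized amounts of recovered transcript.
    print(modulates_association(10.0, 40.0))
    print(modulates_association(12.0, 10.0))
```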

In another embodiment, the present invention provides a method of treating a disease or disorder in a subject, wherein the disease or disorder is associated with a function of an RNA binding protein, comprising contacting a cell in the subject with an agent that modulates an expression or activity of a gene, or a protein encoded by the gene, wherein a transcript of the gene comprises a nucleic acid sequence set forth in SEQ ID No 1-335, thereby treating a disease or disorder in a subject.

In another embodiment, the present invention provides a method of treating a disease or disorder in a subject, wherein the disease or disorder is associated with a function of an RNA binding protein, comprising contacting a cell in the subject with an agent that modulates an expression or activity of a gene, or a protein encoded by the gene, wherein a transcript of the gene comprises a nucleic acid sequence set forth in SEQ ID No 336-449, thereby treating a disease or disorder in a subject.

In another embodiment, the present invention provides a method of diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject, wherein a transcript of the gene comprises a nucleic acid sequence set forth in SEQ ID No 1-335, comprising assessing a splicing pattern of the transcript in a biological sample from the subject; assessing a splicing pattern of a reference standard; and comparing the splicing pattern of the transcript to the splicing pattern of a reference standard, thereby diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject.

In another embodiment, the present invention provides a method of diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject, wherein a transcript of the gene comprises a nucleic acid sequence set forth in SEQ ID No 336-449, comprising assessing a splicing pattern of the transcript in a biological sample from the subject; assessing a splicing pattern of a reference standard; and comparing the splicing pattern of the transcript to the splicing pattern of a reference standard, thereby diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject.

“Associated with,” in one embodiment, refers to a correlation of a parameter associated with the RNA binding protein with the presence of the disease or disorder. In another embodiment, “associated with” refers to a correlation of a parameter associated with the RNA binding protein with the progress of the disease or disorder. In another embodiment, “associated with” refers to a correlation of a parameter associated with the RNA binding protein with a predisposition to the disease or disorder.

In another embodiment, the present invention provides a method of analyzing one or more RNA sequences identified by a method of the invention. Such analysis may, in one embodiment, be used to reveal motifs or patterns among the RNA sequences (Example 3). In another embodiment, such analysis may be used to reveal RNA molecules that were not previously known to interact with the component of the RBP-RNA complex that was selected (Examples 3-5). In another embodiment, such analysis may be used to reveal information about the expression pattern or splicing pattern of a gene encoded by the parent RNA molecules of the RNA fragments analyzed (Example 3).

In another embodiment, the present invention provides a method of identifying a motif on an RNA of the RBP-RNA complex of interest that interacts with a component of the RBP-RNA complex of interest. In one embodiment, the motif may be a protein-binding site (Example 3). In another embodiment, the analysis may reveal a previously unknown binding motif. In another embodiment, the analysis may improve existing knowledge about a binding motif, such as information about the role of surrounding sequence (Example 3). In another embodiment, information from the location of binding sites may, in conjunction with information about the splicing or expression pattern of the parent RNA molecules of the fragments, reveal information about the role of the selected component of the RBP-RNA complex in regulating the expression or splicing pattern of RNA molecules.

In one embodiment, identification or analysis of RNA fragments purified by the present invention may comprise the use of nucleic acid arrays. In one embodiment, the nucleic acid array may be a high-density array. A high-density array will typically include a number of probes that specifically hybridize to the nucleic acid(s) whose expression is to be detected. Array based methods for monitoring gene expression such as those described in U.S. Pat. Nos. 5,800,992, 5,871,928, 5,925,525, 6,040,138 and PCT Application WO92/10588 (published on Jun. 25, 1992), may be utilized, in another embodiment of this invention. In some embodiments, these methods of monitoring gene expression involve (1) providing a pool of target nucleic acids identified by the present invention, (2) hybridizing the nucleic acid sample to a high density array of probes, and (3) detecting the hybridized nucleic acids and calculating a relative or absolute number of each transcript detected. Each of these techniques represents a separate embodiment of the present invention.

For genotyping and variant detection, the high-density array may, in some embodiments, include a number of probes which are designed to assay a particular position which is believed or known to be associated with sequence variation. Array based methods for variant detection used according to this aspect of the invention may be as described, for example, in U.S. Pat. Nos. 5,837,832, 5,856,104, 5,856,092, 5,858,659, 6,027,880 and 5,925,525. In some embodiments, these methods of variant detection involve (1) providing a pool of target nucleic acids comprising DNA from the region(s) to be interrogated (2) hybridizing the nucleic acid sample to a high density array of probes and (3) detecting the hybridized nucleic acids and determining the presence or absence of a sequence variant. Each of these techniques represents a separate embodiment of the present invention.

In another embodiment, the design and use of nucleic acid arrays is carried out as described, for example, in Burney et al (Am J Psychiatry 160: 657-66) or Ahmed (J Environ Sci Health Part C Environ Carcinog Ecotoxicol Rev. 20: 77-116). Each of these techniques represents a separate embodiment of the present invention.

In another embodiment, an embodiment of the CLIP method may be combined with laser capture micro-dissection (LCM). LCM under direct microscopic visualization permits rapid one-step procurement of selected cell populations from a section of complex, heterogeneous tissue. The method entails placing a thin thermoplastic film (such as, for example, ethylene vinyl acetate polymer) over a tissue section, visualizing the tissue microscopically, and selectively adhering the cells of interest to the film with a fixed-position, short-duration, focused pulse from an infrared laser. Strong focal adhesion allows selective procurement of the targeted cells. The film with the procured tissue is then removed from the section and placed directly into DNA, RNA, or enzyme buffer for processing. The technique is known to those skilled in the art (see, for example, Emmert-Buck M R et al, Science 274: 998-1001). The cellular material detaches from the film and is ready for standard processing.

The use of LCM in combination with an embodiment of the CLIP method would facilitate, in one embodiment, the purification of RNA interacting with an RBP of interest from a specific region of tissue visualized in a microscope. In one embodiment, the specific region may be, for example, neuronal dendritic layers.

In another embodiment, LCM, or a similar technology, may be combined with Deep UV microscopy technology to facilitate the formation of covalently bound protein-RNA complexes from a specific region visualized in a microscope. In one embodiment, laser wavelength in the range of 245-260 nm may be used in Deep UV microscopy, a range of wavelengths that may be used to catalyze the formation of covalent bonds. The use of Deep UV microscopy is known to those skilled in the art (see, for example, U.S. Pat. No. 5,482,817).

In another embodiment, there is provided a method for isolating an unknown RBP present in an RBP-RNA complex containing a known component. This method is similar, in one embodiment, to the CLIP method described above, but the target to be identified is an RBP that may be different from the known component used to select the RBP-RNA complex. In one embodiment, the unknown RBP may be bound to an RNA or other nucleic acid molecule at the same time that the known component is bound to the same molecule. In another embodiment, the unknown RBP may bind a different RNA that can exist in an RBP-RNA complex also containing the known component. This method comprises the steps of contacting a biological sample with an agent that results in a covalently bound RBP-RNA complex in the biological sample; obtaining RNA fragments from the biological sample; selecting the RBP-RNA complex containing the known component with a molecule that specifically interacts with the known component; purifying the RBP-RNA complex containing the known component under stringent conditions; and isolating the unknown RBP from the RBP-RNA complex containing the known component.

In another embodiment, there is provided a method for identifying an unknown RBP present in an RBP-RNA complex containing a known component, comprising the steps of contacting a biological sample with an agent that results in a covalently bound RBP-RNA complex in the biological sample; obtaining RNA fragments from the biological sample; selecting the RBP-RNA complex containing the known component with a molecule that specifically interacts with the known component; purifying the RBP-RNA complex containing the known component under stringent conditions; and identifying the unknown RBP from the RBP-RNA complex containing the known component.

It is to be understood that any embodiment listed herein for effecting the CLIP methods of this invention may be utilized for isolating and/or identifying an unknown RBP present in an RBP-RNA complex containing a known component, and represents an additional embodiment of this invention. In one embodiment, the unknown RBP may be identified by antibody detection, mass spectrometry, or any other method well known in the art. Mass spectrometry may be carried out by any method well known to those skilled in the art, such as those described in U.S. Pat. No. 6,586,727. Each method of mass spectrometry or protein identification represents a separate embodiment of the present invention.

In one embodiment of this method, “unknown RBP” refers to an RBP that has not been previously identified. In another embodiment, “unknown RBP” refers to a previously identified protein not known to be an RBP. In another embodiment, “unknown RBP” refers to a previously identified RBP not known to be present in an RBP-RNA complex with the known component of the RBP-RNA complex. In another embodiment, “unknown RBP” refers to the fact that information is lacking about conditions permitting association of the RBP with complexes containing the known component. In another embodiment, “unknown RBP” refers to the fact that information is lacking about the stoichiometry, affinity, binding site of the RBP, or any other aspect of complexes containing the RBP and the known component.

In one embodiment, a covalent bond is created between the known component and an RNA in the complex, and another covalent bond is created between the unknown RBP and an RNA in the complex. The presence of these covalent bonds allows the known component, the RNA, and the unknown RBP to remain associated throughout stringent purification steps. The use of these stringent steps increases the purity of the isolated product relative to schemes using less stringent purification steps.

In one embodiment, the unknown RBP is isolated from the RBP-RNA complex. In one embodiment, isolation may comprise selectively removing the unknown RBP from the RBP-RNA complex. In another embodiment, isolation may comprise removing the unknown RBP from the RBP-RNA complex by a technique that also removes one or more other proteins. In another embodiment, the amount of the unknown RBP is assessed. In another embodiment, the presence of the unknown RBP under various conditions is assessed. In another embodiment, the binding site of the unknown RBP is characterized or identified.

In another embodiment, the unknown RBP can be analyzed, for example, by SDS-PAGE gel electrophoresis, Western blotting and detection with specific antibodies, phosphoamino acid analysis, protease digestion, protein sequencing, or isoelectric focusing. The use of these techniques is known to those skilled in the art, and is described in, for example, Methods in Enzymology: Guide to Molecular Cloning Techniques (1987) Berger and Kimmel, eds. Each technique represents a separate embodiment of the present invention.

In another embodiment, the invention provides a material that may be used in a screen to identify bioactive molecules. In one embodiment, the material may comprise a motif on an RNA of the RBP-RNA complex of interest. In another embodiment, the material may comprise a derivative of the motif. In another embodiment, the material may comprise a sequence homologous to the motif. In another embodiment, the material may comprise a molecule containing the motif. In another embodiment, the material may comprise a sequence fragment identified by the method of this invention. In another embodiment, the material may comprise a sequence identified by the method of this invention (SEQ ID No 1-449).

As used herein, the terms “homology”, “homologue” or “homologous”, in any instance, indicate that the sequence referred to, whether an amino acid sequence, or a nucleic acid sequence, exhibits, in one embodiment at least 70% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 72% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 75% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 77% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 80% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 82% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 85% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 87% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 90% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 92% correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits at least 95% or more correspondence with the indicated sequence. In another embodiment, the amino acid sequence or nucleic acid sequence exhibits 95-100% correspondence to the indicated sequence. Similarly, as used herein, the reference to a correspondence to a particular sequence includes both direct correspondence, as well as homology to that sequence as herein defined.
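The percentage thresholds recited above may be applied, for example, by computing the fraction of matching positions between two sequences. The following minimal sketch assumes that "correspondence" is measured as percent identity over two pre-aligned sequences of equal length; that interpretation, the function names, and the example sequences are assumptions made for illustration, and other homology measures described herein could be substituted.

```python
def percent_correspondence(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def is_homologous(seq_a, seq_b, threshold=70.0):
    """Apply one of the recited thresholds (70% by default) to two sequences."""
    return percent_correspondence(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-nucleotide sequences differing at one position.
    print(percent_correspondence("AGGGAGGACG", "AGGGAGGACC"))
    print(is_homologous("AGGGAGGACG", "AGGGAGGACC", threshold=95.0))
```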

In another embodiment, the invention provides compounds that interact with a motif of an isolated nucleic acid sequence. In one embodiment, the nucleic acid is RNA or a derivative thereof. In one embodiment, the sequences comprise SEQ ID No 1-335.

In another embodiment, the sequences comprise SEQ ID No 336-449.

In another embodiment, the sequences comprise SEQ ID No 63, 64, 76, 77, 78, 84, and 292-335.

In another embodiment, the present invention provides the use of the RNA motifs, sequence fragments, or sequences in a screening assay to identify bioactive molecules that interact with the RNA motifs, sequence fragments, or sequences. A molecule found to interact with an RNA motif, sequence fragment, or sequence is likely, in one embodiment, to modulate the activity of an RNA molecule containing the motif, sequence fragment, or sequence in vivo.

In one embodiment, the screening assay can be performed in a cell-based system. In one embodiment, the screening assay can be performed in a cell-free system. Cell-based assays can be native, i.e., cells that normally express the enzyme, as a biopsy or expanded in cell culture. In an alternate embodiment, the cell-based assay involves recombinant host cells expressing the enzyme protein. In one embodiment, the screening assay can be a high-throughput screen. Each such system represents a separate embodiment of the present invention.

To perform a cell-free drug screening assay, it is sometimes desirable, in one embodiment, to immobilize either the material or its target molecule to facilitate separation of complexes from un-complexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Techniques for immobilizing nucleic acids on matrices can be used in the drug screening assay. Matrices are then combined with the cell lysates (e.g., 35S-labeled) and the candidate compound, and the mixture is incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads are washed to remove any unbound label, and the matrix is immobilized and the radiolabel determined directly, or in the supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix and separated by SDS-PAGE, and the level of target molecule bound to the material found in the bead fraction can be quantitated from the gel using standard electrophoretic techniques. For example, either the material or its target molecule can be immobilized utilizing conjugation of biotin and streptavidin using techniques well known in the art. Alternatively, antibodies reactive with the material but which do not interfere with binding of the material to its target molecule can be derivatized to the wells of the plate, and the material trapped in the wells by antibody conjugation. Methods for detecting such complexes include immuno-detection of complexes using antibodies reactive with the target molecule, or which are reactive with the material and compete with the target molecule, as well as enzyme-linked assays, which rely on detecting an enzymatic activity associated with the target molecule. These methods represent additional embodiments of the present invention.

Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al., Nature 354:82-84; Houghten et al., Nature 354:84-86) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al., Cell 72:767-778); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab′)2, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g., molecules obtained from combinatorial and natural product libraries). The use of each such group of compounds, as well as any other used in the art, represents a separate embodiment of the current invention.

The motifs, sequence fragments, or sequences claimed for use in a screen to identify bioactive molecules may, in one embodiment, be used in competition binding assays in methods designed to discover compounds that interact with the motifs, sequence fragments, or sequences (e.g., binding partners and/or ligands). Thus, a compound is exposed to the material under conditions that allow the compound to bind or to otherwise interact with the motifs, sequence fragments, or sequences. A molecule that normally interacts with the material is also added to the mixture. If the test compound interacts with the material, it decreases the amount of complex formed between the material and the molecule that normally interacts with it. This type of assay may be used in cases in which compounds are sought that interact with specific regions of the material. In another embodiment, the screening method involves contacting a biological sample with a compound capable of interacting with the RNA motifs, sequence fragments, or sequences such that the interaction can be detected. Such an assay can be provided in a single detection format or a multi-detection format such as an antibody chip array. These methods represent separate embodiments of the present invention.
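As a non-limiting illustration of how a competition binding readout might be scored, the following sketch expresses the decrease in complex formation as a percentage of the signal measured without the test compound. The function name `percent_competition` and the use of a single background-corrected signal per condition (for example, bound radiolabel counts) are assumptions made for the example; the specification does not prescribe a particular readout or formula.

```python
def percent_competition(complex_without_compound: float,
                        complex_with_compound: float) -> float:
    """Percent decrease in complex formed when the test compound is present.

    Assumption: both arguments are background-corrected signals on the
    same scale (e.g., counts of bound label). A positive result means the
    test compound reduced complex formation.
    """
    if complex_without_compound <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * (1.0 - complex_with_compound / complex_without_compound)
```

A value near zero would suggest that the test compound does not compete under the conditions used, whereas a larger value would be consistent with interaction between the compound and the material.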

In another embodiment, the motifs, sequence fragments, or sequences of the present invention can be used to screen a compound for its ability to stimulate or inhibit an interaction between RNA motifs, sequence fragments, or sequences of the present invention and a molecule that normally interacts with the motifs, sequence fragments, or sequences. In one embodiment, the molecule is a protein. In another embodiment, the molecule is an RBP. In another embodiment, the molecule interacts with an RBP. In another embodiment, the molecule may be a different molecule. Such an assay may include the steps of contacting a biological sample with a candidate molecule under conditions that allow the motifs, sequence fragments, or sequences to interact with the respective molecule, and to detect the formation of a complex between the motifs, sequence fragments, or sequences and the interacting molecule or to detect the biochemical consequence of the interaction with the motifs, sequence fragments, or sequences and the interacting molecule, such as, for example, splicing, export, or localization of a parent RNA or expression, activity, or concentration of a protein encoded by a parent RNA. Each such assay represents a separate embodiment of the current invention.

As used herein, the phrase “parent RNA” or “parent RNA molecule”, in one embodiment, refers to a larger RNA molecule which comprises the material. In another embodiment, “parent RNA” or “parent RNA molecule” can be the material itself.

It will be appreciated by one skilled in the art that bioactive molecules identified by a screen of the present invention may be further assayed to gain information about their biological activities. For example, the bioactive molecules may be assayed for their effect on the RNA splicing, export, or localization or protein expression, activity, or concentration, of RNA or protein molecules in the cell. In one embodiment, the RNA or protein molecules used in the assay may comprise the motifs, sequence fragments, or sequences of the present invention. The effect of the bioactive molecules may be assessed for the RNA or protein in its natural state. In another embodiment, the effect of the bioactive molecules may be assessed for the RNA or protein in an altered form that causes a specific disease or pathology associated with the enzyme.

Bioactive molecules identified in these screens can be further screened to determine their effect on a parent RNA or protein as described herein. Further, these compounds can be tested in animal or invertebrate systems to determine activity/effectiveness. Compounds can be identified that activate (agonist) or inactivate (antagonist) a parent RNA or protein to a desired degree.

In one embodiment, bioactive molecules identified in these screens may modulate splicing of a parent RNA molecule of the material. In another embodiment, bioactive molecules identified in these screens may modulate export of a parent RNA molecule from the nucleus. In another embodiment, bioactive molecules identified in these screens may modulate localization of a parent RNA molecule to or near the inhibitory synapse. In another embodiment, bioactive molecules identified in these screens may modulate export of a parent RNA molecule from the cell. In another embodiment, bioactive molecules identified in these screens may modulate expression of a protein encoded by a parent RNA molecule. In another embodiment, bioactive molecules identified in these screens may modulate steady-state concentration of a protein encoded by a parent RNA molecule. In another embodiment, bioactive molecules identified in these screens may modulate activity of a protein encoded by a parent RNA molecule. Methods for assessment of RNA splicing, export, or localization or protein expression, activity, or concentration are known to those skilled in the art (FIG. 4; also see Suzuki et al, Am J Med. Genet. 121B:7-13).

The invention further includes, in another embodiment, end point assays to further characterize bioactive molecules for their biological function. The assays may, in one embodiment, involve an assay of events in a signal transduction pathway. Thus, the phosphorylation of a substrate, activation of a protein, and a change in the expression of genes that are up- or down-regulated in response to the enzyme protein dependent signal cascade can be assayed. In one embodiment, any of the biological or biochemical functions mediated by a protein encoded by the motifs, sequence fragments, or sequences of the present invention can be used as an endpoint assay. Specifically, a biological function of a cell or tissues that expresses the motifs, sequence fragments, or sequences of the present invention can be assayed. Each such assay represents a separate embodiment of the current invention.

Bioactive molecules identified by a screen of the present invention can be used to treat a subject with a disorder or disease, as will be appreciated by one skilled in the art. For example, bioactive molecules may be identified that modulate splicing or expression of RNA molecules comprising a binding site. If an aberrant splicing or expression pattern of a gene is associated with a disease state or disorder, and that gene contains a binding site that interacts with the bioactive molecule identified, the bioactive molecule might constitute a therapy or treatment for the disease or disorder. In another embodiment, a bioactive molecule that modulates the export or localization of an RNA molecule might similarly constitute a therapy or treatment for a disease associated with aberrant export or localization of an RNA molecule, provided that the RNA molecule contains a binding site that interacts with the bioactive molecule identified. In another embodiment, a bioactive molecule that modulates the activity or concentration of a protein might similarly constitute a therapy or treatment for a disease associated with aberrant activity or concentration of a protein, provided that the RNA molecule encoding the protein contains a binding site that interacts with the bioactive molecule identified. The therapy would include the steps of administering the bioactive molecule in a pharmaceutical composition to a subject in need of such treatment. These therapeutic methods represent embodiments of the present invention for all applications of therapeutic methods mentioned herein.

As used herein, the term “disorder” may refer, in one embodiment, to any type of disease, disorder, or symptom.

This invention further pertains to novel bioactive molecules identified by the above-described screening assay. Accordingly, it is within the scope of this invention to further use a bioactive molecule identified as described herein in an appropriate animal model. For example, an agent identified as described herein (e.g., an RNA modulating agent, protein modulating agent, an antisense nucleic acid molecule, an antibody specific for the material, or a binding partner specific for the material) can be used in an animal or other model to determine the efficacy, toxicity, or side effects of treatment with such an agent. Alternatively, an agent identified as described herein can be used in an animal or other model to determine the mechanism of action of such an agent. Furthermore, this invention pertains to uses of novel agents identified by the above-described screening assay for treatments as described herein. These agents, and their uses described herein, represent embodiments of the present invention.

In another embodiment, the invention provides a method of assessing an expression of a gene of a cell, tissue, or biological sample, comprising the following steps:

-   a. purifying a plurality of RBP-RNA complexes from the cell, tissue, or biological sample by the CLIP method;
-   b. identifying an RNA molecule in the plurality of RBP-RNA complexes; and
-   c. assessing the presence or amount of the gene among the plurality of RBP-RNA complexes,

thereby assessing an expression of a gene of a cell, tissue, or biological sample.

In one embodiment, the preceding method of assessing an expression of a gene can be used to generate a gene expression profile of the cell, tissue, or biological sample. In one embodiment of this method, the expression of multiple genes of the cell, tissue, or biological sample is assessed. The expression data are then, in one embodiment, compiled to obtain a gene expression profile. Analysis or comparison of multiple sequences is performed manually, in one embodiment. In another embodiment, the analysis is performed using one of various methods for determining homology described hereinabove.
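The compilation of expression data into a profile can be sketched, for illustration only, as a tally of identified RNA molecules by gene. The sketch assumes that each RNA molecule recovered from the RBP-RNA complexes has already been assigned to a gene by some upstream step that is not shown; the function name `expression_profile` and the gene labels used in the accompanying note are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def expression_profile(tag_genes: Iterable[str]) -> Dict[str, float]:
    """Fraction of identified RNA molecules assigned to each gene.

    Assumption: `tag_genes` holds one gene label per RNA molecule
    identified among the purified RBP-RNA complexes.
    """
    counts = Counter(tag_genes)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing identified, so the profile is empty
    return {gene: n / total for gene, n in counts.items()}
```

For instance, four identified RNA molecules labeled geneA, geneA, geneB, and geneC would yield a profile in which geneA accounts for one half and geneB and geneC for one quarter each.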

In another embodiment, the present invention provides a method of screening a test compound for its ability to modulate expression of a gene in a cell, comprising the steps of: (a) purifying a first plurality of RNA binding protein-RNA complexes from the cell by the CLIP method, wherein the cell has been contacted with the test compound; (b) identifying a first plurality of RNA molecules in the first plurality of RBP-RNA complexes; (c) assessing an amount of the gene among the first plurality of RNA molecules; (d) purifying a second plurality of RNA binding protein-RNA complexes from the cell by the CLIP method, wherein the cell has not been contacted with the test compound; (e) identifying a second plurality of RNA molecules in the second plurality of RBP-RNA complexes; and (f) assessing an amount of the gene among the second plurality of RNA molecules; wherein a difference between the amount of the gene in the first plurality of RNA molecules and the amount of the gene in the second plurality of RNA molecules indicates an ability of the test compound to modulate expression of a gene in a cell.

In another embodiment, the invention provides a method of screening a test compound for its ability to modulate gene expression in a cell, tissue, or biological sample, comprising the steps of (a) generating a first gene expression profile of a cell, tissue, or biological sample according to the method of generating a gene expression profile described herein, wherein the cell, tissue, or biological sample has been contacted with a test compound; (b) generating a second gene expression profile of a cell, tissue, or biological sample according to the method of generating a gene expression profile described herein, wherein the cell, tissue, or biological sample has not been contacted with the test compound; and (c) identifying differences between the first and second gene expression profile, differences indicating that the test compound can modulate gene expression in the cell, tissue, or biological sample. The test compound need not be one identified by a screen of the current invention.
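Step (c) of the foregoing method, identifying differences between the first and second gene expression profiles, might be sketched as follows. The sketch assumes that each profile is a mapping from a gene label to a fractional amount, and it applies a hypothetical cutoff, `min_abs_change`, that is not specified anywhere herein; genes absent from one profile are treated as having an amount of zero.

```python
from typing import Dict


def profile_differences(treated: Dict[str, float],
                        untreated: Dict[str, float],
                        min_abs_change: float = 0.1) -> Dict[str, float]:
    """Genes whose amount differs between the two profiles.

    Returns treated minus untreated for each gene whose absolute change
    is at least `min_abs_change` (an illustrative, assumed cutoff).
    """
    differences = {}
    for gene in sorted(set(treated) | set(untreated)):
        delta = treated.get(gene, 0.0) - untreated.get(gene, 0.0)
        if abs(delta) >= min_abs_change:
            differences[gene] = delta
    return differences
```

A non-empty result would be consistent with the test compound modulating gene expression in the sample, subject to whatever controls and replicate measurements the practitioner considers appropriate.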

In another embodiment, there is provided a method of treating a disease or disorder in a subject, comprising contacting a cell in the subject with an agent that modulates the expression or activity of a gene, or a protein encoded by the gene, wherein the gene has a sequence comprising a nucleic acid sequence as set forth in SEQ ID No 1-335, thereby treating the disease or disorder.

In another embodiment, there is provided a method of treating a disease or disorder in a subject, comprising contacting a cell in the subject with an agent that modulates the expression or activity of a gene, or a protein encoded by the gene, wherein the gene has a sequence comprising a nucleic acid sequence as set forth in SEQ ID No 336-449, thereby treating the disease or disorder.

In another embodiment, the present invention provides a method of using the RNA binding motifs to diagnose or screen for disease or predisposition to a disease mediated by parent genes of RNA fragment sequences of the present invention. In one embodiment, the disease or ailment is Paraneoplastic Opsoclonus Myoclonus Ataxia (POMA). In another embodiment, the disease is another neurologic disorder. In another embodiment, the disease is a non-neurologic disorder. In another embodiment, the disease is an autoimmune disorder. Any of the methods for diagnosis described herein may be used. Each method represents a separate embodiment of the present invention.

Nova was the first mammalian tissue-specific splicing factor identified. Nova is a neuron-specific RBP targeted in patients with the autoimmune disorder paraneoplastic opsoclonus-myoclonus ataxia. Nova proteins were identified as the autoantigens in POMA using high titer antibodies to clone cDNAs encoding two highly homologous KH-type RBPs, Nova-1 and Nova-2. Antisera from 6/6 POMA patients were found to block the interaction of Nova protein with RNA. These antibodies have been hypothesized to gain access to neurons and play a role in provoking the neuronal degeneration in POMA by blocking critical RNA-protein interactions.

POMA, also known as opsoclonus-myoclonus-ataxia syndrome, is an autoimmune neurological disorder found in cancer patients, which is characterized by a failure of the inhibition of brainstem and spinal motor systems. The clinical syndrome of paraneoplastic opsoclonus is characterized by the acute onset of opsoclonus and truncal ataxia, often accompanied by encephalopathy, myoclonus and a cerebrospinal fluid pleocytosis, but with no accompanying loss of neurons from the cerebellum, brainstem, cerebral hemispheres, or spinal cord. Unlike most other paraneoplastic syndromes, the course is often remitting and relapsing.

As used herein, “neurologic disorder” refers, in one embodiment, to a disease or disorder selected from the group consisting of epilepsy, convulsions, and seizure disorders. In another embodiment, the neurological disease or disorder is associated with spasticity. In another embodiment, the neurological disease or disorder is a neurodegenerative disorder. In another embodiment, the neurological disease or disorder is selected from the group consisting of spasticity, skeletal muscle spasms, restless leg syndrome, anxiety, stress, multiple sclerosis (MS), Sjogren's Syndrome, stroke, head trauma, spinal cord injury, Parkinson's Disease, Huntington's Disease, Alzheimer's Disease, amyotrophic lateral sclerosis, myotonic dystrophy, spinocerebellar ataxia, Spinal Muscular Atrophy, paraneoplastic neurologic disorders, multiple system atrophy, human T-lymphotropic virus type 1 (HTLV-1)-associated myelopathy/tropical spastic paraparesis (HAM/TSP), migraine, headaches, and bipolar disorder. In another embodiment, the treatment alleviates or prevents convulsions or spasticity. All of these represent separate embodiments of the present invention.

As used herein, “autoimmune disorder” refers, in one embodiment, to a disease or disorder selected from the group consisting of autoimmune endocarditis, SLE, rheumatoid arthritis (RA), systemic sclerosis (SSc), celiac disease, insulin dependent diabetes mellitus, and juvenile rheumatoid arthritis (JRA).

Another embodiment of the present invention provides a method of diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject, the gene having a sequence comprising a nucleic acid sequence set forth in SEQ ID No 1-335, comprising assessing a splicing pattern of a transcript of said gene, assessing a splicing pattern of a reference standard, and comparing said splicing pattern of a transcript of said gene to said splicing pattern of a reference standard, thereby diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject.

Another embodiment of the present invention provides a method of diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject, the gene having a sequence comprising a nucleic acid sequence set forth in SEQ ID No. 336-449, comprising assessing a splicing pattern of a transcript of said gene, assessing a splicing pattern of a reference standard, and comparing said splicing pattern of a transcript of said gene to said splicing pattern of a reference standard, thereby diagnosing a disease or disorder associated with an alternate splicing pattern of a gene in a subject.

In one embodiment, “associated with” refers to a correlation between the alternate splicing pattern and the disease or disorder. In another embodiment, “associated with” refers to a causation of the disease or disorder by the alternate splicing pattern. In another embodiment, “associated with” refers to a predisposition to the disease or disorder in subjects with the alternate splicing pattern. Each possibility represents a separate embodiment of the present invention.

In one embodiment, the splicing pattern is assessed by RT-PCR analysis as described herein. Alternately, the splicing pattern can be assessed by any method known to those skilled in the art, as described, for example, in Molecular Cloning, (2001), Sambrook and Russell, eds. or Methods in Enzymology: Guide to Molecular Cloning Techniques (1987) Berger and Kimmel, eds. Each such method represents a separate embodiment of the present invention.

In one embodiment, “reference standard” refers to a sample derived from one or more individuals that do not exhibit the disorder of interest. A significant departure in a pattern observed in a sample from a subject from a pattern observed in a reference standard may be indicative of a disorder, or predisposition for a disorder.
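One way in which a splicing pattern might be compared to a reference standard is sketched below, purely as an illustration. The sketch assumes that the splicing pattern of an alternative exon is summarized as percent exon inclusion computed from two quantified RT-PCR product signals, and that a "significant departure" is modeled as a shift of at least `min_shift` percentage points. Both the summary statistic and the cutoff are assumptions made for the example; the specification does not define either.

```python
def percent_inclusion(included_signal: float, skipped_signal: float) -> float:
    """Percent of product that contains the alternative exon.

    Assumption: the two arguments are quantified signals for the
    exon-included and exon-skipped RT-PCR products from one sample.
    """
    total = included_signal + skipped_signal
    if total <= 0:
        raise ValueError("no signal to quantify")
    return 100.0 * included_signal / total


def departs_from_reference(sample_inclusion: float,
                           reference_inclusion: float,
                           min_shift: float = 20.0) -> bool:
    """True if the sample differs from the reference standard by at least
    `min_shift` percentage points (an illustrative, assumed cutoff)."""
    return abs(sample_inclusion - reference_inclusion) >= min_shift
```

In practice the reference value would be derived from one or more individuals who do not exhibit the disorder of interest, as described above.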

In one embodiment, the disorder is a neurological disorder. In another embodiment, the disorder is POMA. In another embodiment, the disorder is an autoimmune disorder.

In another embodiment, the disorder is a cancer or disorder involving neoplastic cells. “Neoplastic cells” refers, in one embodiment, to cells whose normal growth control mechanisms are disrupted (typically by accumulated genetic mutations), thereby providing potential for uncontrolled proliferation. Thus, “neoplastic cells” can include both dividing and non-dividing cells. For purposes of the invention, neoplastic cells include cells of tumors, neoplasms, carcinomas, sarcomas, leukemias, lymphomas, and the like. In another embodiment, “neoplastic cells” can include central nervous system tumors, especially brain tumors. These include glioblastomas, astrocytomas, oligodendrogliomas, meningiomas, neurofibromas, ependymomas, Schwannomas, neurofibrosarcomas, etc. In another embodiment, “neoplastic cells” can include either benign or malignant neoplastic cells.

Antibodies to Nova and Neuronal Hu proteins are elicited by tumors, linking expression or activity of each of these proteins to cancer. The present invention has disclosed that a set of RNA molecules with the sequences as set forth in SEQ ID No 1-335 interact with Nova in a transformed cell line (FIG. 3). Thus, the particular splicing pattern or the expression of these RNA molecules is also characteristic of neoplastic transformation. In another embodiment, the particular splicing or expression pattern of one or more of these sequences may be useful as a diagnostic marker for the presence of cancer or other such disorder involving neoplastic cells. In another embodiment, a particular splicing or expression pattern of one or more of these sequences may be useful as a diagnostic marker for a particular stage of cancer or associated disorders involving neoplastic cells.

In another embodiment, the disorder is arthritis. HuR, a homologue of Neuronal Hu proteins, regulates the stability and/or nuclear export of the RNA of early response genes (Gallouzi, I E et al, Science 294: 1895-1901), which are expressed in arthritic joints (Aicher W K et al, Arthritis Rheum 48:348-59), indicating that Neuronal Hu proteins and homologous proteins may play a role in the etiology of arthritis. Since RBPs regulate the splicing of multiple targets (Examples 3-8), the splicing pattern or the expression of other neuronal Hu protein targets may be characteristic of arthritis. Thus, a particular splicing or expression pattern of one or more of these sequences may be useful as a diagnostic marker of the presence of arthritis.

In another embodiment, the disorder is atherosclerosis. Early response gene-1 (erg-1), which is regulated by HuR, is upregulated in atherosclerotic lesions (Bea F et al, Atherosclerosis 167:187-194). These data indicate that Neuronal Hu proteins and homologous proteins may play a role in the etiology of atherosclerosis. Since RBPs regulate the splicing of multiple targets (Examples 3-8), the splicing pattern or the expression of other neuronal Hu protein targets may be characteristic of atherosclerosis. Thus, a particular splicing or expression pattern of one or more of these sequences may be useful as a diagnostic marker of the presence of atherosclerosis.

The term “arthritis”, in one embodiment, refers to rheumatoid arthritis, ankylosing spondylitis, juvenile rheumatoid arthritis, psoriatic arthritis, or any other type of arthritis or arthritis-like disorder.

In another embodiment, the disease is a metabolic disease. In another embodiment, the disease is diabetes. Nova-2 knockout mice exhibit a diabetes-like phenotype (Example 9).

In one embodiment, splicing of the parent gene is assessed. In another embodiment, branch point recognition of the parent gene is assessed. In another embodiment, export from the nucleus of the parent gene is assessed. In another embodiment, the export from the cell of the parent gene is assessed. In another embodiment, localization of the parent gene or its RNA to or close to inhibitory synapses is assessed. In another embodiment, expression level of a protein encoded for by the gene is also assessed. In another embodiment, the steady-state concentration of a protein encoded for by the gene is also assessed. In another embodiment, activity of a protein encoded for by the gene is also assessed.

In another embodiment, a patient may have a variant sequence in the parent RNA or protein whose splicing, localization, export, activity, concentration, or expression is assessed. In another embodiment the patient may have a normal sequence in the parent RNA or protein. Thus, the material can be isolated from a biological sample and assayed for the presence of a genetic mutation that results in aberrant sequence. This includes amino acid substitution, deletion, insertion, rearrangement, (as the result of aberrant splicing events), and inappropriate post-translational modification. Analytic methods for determining the presence of mutations include altered restriction enzyme analysis, nucleotide sequencing, electrophoretic mobility, altered tryptic peptide digest, altered enzyme activity in cell-based or cell-free assay, alteration in substrate or antibody-binding pattern, altered isoelectric point, direct amino acid sequencing, and any other of the known assay techniques useful for detecting mutations in a polynucleotide or protein. Such an assay can be provided in a single detection format or a multi-detection format such as an antibody chip array. Each such method represents a separate embodiment of the present invention.

In vitro techniques for detection of the motifs, sequence fragments, or sequences of the present invention or the proteins they encode include enzyme-linked immunosorbent assays (ELISAs), immunoprecipitation, and immunofluorescence using a detection reagent, such as an antibody or protein binding agent. Alternatively, the material can be detected in vivo in a subject by introducing into the subject a labeled antibody against the material or another type of detection agent. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In addition, methods that detect the allelic variant of a material expressed in a subject and methods which detect fragments of a material in a sample could also be used. Each such method represents a separate embodiment of the present invention.

The motifs, sequence fragments, or sequences of the present invention may be useful, in one embodiment, in pharmacogenomic analysis. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. The clinical outcomes of these variations result in severe toxicity of therapeutic drugs in certain individuals or therapeutic failure of drugs in certain individuals as a result of individual variation in metabolism. Thus, the genotype of the individual can determine the way a therapeutic compound acts on the body or the way the body metabolizes the compound. Further, the activity of drug metabolizing enzymes affects both the intensity and duration of drug action. Thus, the pharmacogenomics of the individual permit the selection of effective compounds and effective dosages of such compounds for prophylactic or therapeutic treatment based on the individual's genotype. The discovery of genetic polymorphisms in some drug metabolizing enzymes has explained why some patients do not obtain the expected drug effects, show an exaggerated drug effect, or experience serious toxicity from standard drug dosages. Polymorphisms can be expressed in the phenotype of the extensive metabolizer and the phenotype of the poor metabolizer. Accordingly, genetic polymorphism may lead to allelic protein variants of the enzyme protein in which one or more of the enzyme functions in one population is different from those in another population. The materials thus allow a target to ascertain a genetic predisposition that can affect treatment modality. Accordingly, substrate dosage would necessarily be modified to maximize the therapeutic effect within a given population containing a polymorphism. The use of materials of this invention in pharmacogenomic analysis thus represents an additional embodiment of the present invention.

Another embodiment of the present invention provides a method of treating a symptom, disorder, or disease in a subject, comprising contacting a cell in the subject with an agent that modulates the expression or activity of a parent gene or a protein encoded by the parent gene of a gene motif, sequence fragment, or sequence, comprising a sequence as set forth in SEQ ID No 1-335, thereby treating the disorder. In another embodiment, the gene motif, sequence fragment, or sequence comprises a sequence as set forth in SEQ ID No 336-449. In one embodiment, the agent is a parent gene, gene motif, sequence fragment, or sequence of the present invention. In one embodiment, the cell that is contacted is a neuron. Any therapeutic method disclosed that may modulate one of the parent genes, gene motifs, sequence fragments, or sequences described here is considered a part of the present invention.

In one embodiment, the agent is a nucleic acid. Protocols for introducing a nucleic acid or vector of the invention into cells may comprise, for example: direct DNA uptake techniques, virus, plasmid, linear DNA or liposome mediated transduction, or transfection, direct injection, magnetoporation, receptor-mediated uptake and others. In another embodiment, recombinant molecules encoding the RNA molecules are introduced into host cells such that they become integrated into the host cell genome. In one embodiment, the recombinant molecule is flanked by sequences known to promote homologous recombination. In another embodiment, the integrated recombinant molecule is transcribed within the cell to produce a heterologous RNA molecule (see, for example, “Methods in Enzymology” Vol. 1-317, Academic Press, Current Protocols in Molecular Biology Ausubel et al, eds.; and in Molecular Cloning (2001), Sambrook and Russell, eds.; or other standard laboratory manuals). In another embodiment, a nucleic acid may be chemically modified to be a PNA or another nucleic acid, or conjugated to a Trojan peptide as described herein. It is to be understood that any direct means or indirect means of intracellular access of a nucleic acid or vector of the invention is contemplated herein, and represents an embodiment thereof, of any application of introducing a nucleic acid or vector of the invention into cells mentioned herein.

In one embodiment of the present invention, “nucleic acids” refers to a string of at least two base-sugar-phosphate combinations. The term includes, in one embodiment, deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). “Nucleotides” refers, in one embodiment, to the monomeric units of nucleic acid polymers. RNA may be in the form of a tRNA (transfer RNA), snRNA (small nuclear RNA), rRNA (ribosomal RNA), mRNA (messenger RNA), anti-sense RNA, small inhibitory RNA (siRNA), micro RNA (miRNA) and ribozymes. The use of siRNA and miRNA has been described (Caudy A A et al, Genes & Devel 16:2491-96 and references cited therein). DNA may be in the form of plasmid DNA, viral DNA, linear DNA, or chromosomal DNA or derivatives of these groups. In addition, these forms of DNA and RNA may be single, double, triple, or quadruple stranded. The term also includes, in one embodiment, artificial nucleic acids that may contain other types of backbones but the same bases. Examples of artificial nucleic acids are PNAs (peptide nucleic acids), phosphorothioates, and other variants of the phosphate backbone of native nucleic acids. PNAs contain peptide backbones and nucleotide bases, and are able to bind both DNA and RNA molecules. The use of phosphorothioate nucleic acids and PNA is known to those skilled in the art, and is described in, for example, Nielsen P E, Curr Opin Struct Biol 9:353-57; and Raz N K et al Biochem Biophys Res Commun. 297:1075-84. The production and use of nucleic acids is known to those skilled in the art and is described, for example, in Molecular Cloning, (2001), Sambrook and Russell, eds. and Methods in Enzymology: Guide to Molecular Cloning Techniques (1987) Berger and Kimmel, eds. Each nucleic acid derivative represents a separate embodiment of the present invention.

In another embodiment of the present invention, nucleic acids may be conjugated to “Trojan peptides” such as HIV TAT peptide (Tat), transportan, and Antennapedia peptide. These peptides facilitate entry of the nucleic acids into cells, and their use is described in the literature (Derossi et al, Trends Cell Biol 8:84-87; Simmons C G et al, Bioorg Med Chem Lett 7:3001-6; Pooga H et al, Nat Biotechnol 16:857-61). This technology may be able to introduce nucleic acids to the splicing machinery of a cell (Sergueev et al, Pharm Res 19:744).

In another embodiment, nucleic acids may comprise at least one modified base moiety which is selected from the group including, but not limited to: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5N-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid, wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl)uracil, and 2,6-diaminopurine.

In one embodiment, the disorder is a neurological disorder as described herein. In another embodiment, the disorder is POMA, Multiple Sclerosis, Alzheimer's Disease, Huntington's Disease or Parkinson's Disease. In one embodiment, the disorder is an autoimmune disorder as described herein.

In another embodiment, the disorder may be a type of cancer or disorder involving neoplastic cells as described herein. Antibodies to Nova and Neuronal Hu proteins are elicited by tumors, linking expression or activity of each of these proteins to cancer. The present invention has shown that the RNA molecules as set forth in SEQ ID No 1-335 interact with Nova in a transformed cell line (FIG. 3), indicating that their splicing or expression pattern may play a role in the transformation or survival of transformed cells such as cancer cells. In addition, HuR, a homologue of Neuronal Hu proteins, is believed to regulate the stability and/or nuclear export of the RNA of early response genes, which have been implicated in angiogenesis (Tarnawski et al, J Mol Med 2003). In one embodiment, an antibody or other molecule that binds a protein encoded by a sequence as set forth in SEQ ID No 1-335 or 336-449 may be conjugated to a toxic compound or biological molecule for use in cancer chemotherapy.

In another embodiment, the disorder is arthritis. HuR, a homologue of Neuronal Hu proteins, regulates the stability and/or nuclear export of the RNA of early response genes, which are expressed in arthritic joints, indicating that Neuronal Hu proteins and homologous proteins may play a role in the etiology of arthritis through regulation of the splicing or expression of any of the genes containing the CLIP fragments disclosed in this invention. Thus, modulating the splicing, expression, or activity of one or more of these sequences is a therapeutic strategy for arthritis, and represents an additional embodiment of this invention.

In another embodiment, the disorder is atherosclerosis. Erg-1, which is regulated by HuR, is upregulated in atherosclerotic lesions. These data indicate that Neuronal Hu proteins and homologous proteins may play a role in the etiology of atherosclerosis through regulation of the splicing or expression of any of the genes containing the CLIP fragments disclosed in this invention. Thus, modulating the splicing, expression, or activity of one or more of these sequences is a therapeutic strategy for atherosclerosis, and represents an additional embodiment of this invention.

In another embodiment, the disease is a metabolic disease. In another embodiment, the disease is diabetes. Nova-2 knockout mice exhibit a diabetes-like phenotype (Example 9).

In one embodiment, the agent modulates the expression or activity of the gene via inhibition or abrogation of binding of a protein to an RNA transcript of the gene comprising a sequence as set forth in SEQ ID No 1-335. In another embodiment, the inhibition or abrogation of protein binding may occur via steric hindrance. In another embodiment, the inhibition or abrogation of protein binding may occur via competitive inhibition. In another embodiment, splicing of the gene is modulated. In another embodiment, export from the nucleus of the gene is modulated. In another embodiment, localization of the gene or its RNA transcript to or close to inhibitory synapses is modulated. In another embodiment, export of the gene from the cell is modulated. In another embodiment, expression level of a protein encoded for by the gene is modulated. In another embodiment, the steady-state concentration of a protein encoded for by the gene is modulated. In another embodiment, activity of a protein encoded for by the gene is modulated.

In one embodiment, “steric hindrance” describes an effect on relative occupancy of a binding site caused by the space-filling properties of those parts of a molecule attached at or near the binding site. For example, the agent described herein may, by binding to the nucleic acid sequence described herein, reduce the rate of occupancy by a natural ligand of a binding site on the nucleic acid molecule. In another embodiment, the agent may completely prevent binding of a natural ligand to the binding site.

In one embodiment, “competitive inhibition” refers to binding of the agent to the RNA transcript of the gene, at a site that overlaps with a site at which an RNA binding protein would otherwise bind, such that binding of the RBP is substantially reduced. In one embodiment, “competitive inhibition” refers to binding of the agent to the RNA transcript of the gene, at a site that is identical to a site at which an RNA binding protein would otherwise bind.

In another embodiment, an agent that modulates the expression or activity of the motifs, sequence fragments, or sequences of the present invention may be useful for treating a disorder characterized by, for example, an absence of, inappropriate, or unwanted expression of a parent gene of a motif, sequence fragment, or sequence. Accordingly, methods for treatment may involve, in one embodiment, contacting a cell of a patient with the motifs, sequence fragments, or sequences of the present invention, their agonists or their antagonists. Each such method represents a separate embodiment of the present invention.

In one embodiment, the motif, sequence fragment, or sequence, agonist, or antagonist may be targeted to a neuron. In another embodiment, the material, agonist, or antagonist may be targeted to neuronal tissue. In one embodiment, the material, agonist, or antagonist may be targeted to the nervous system or a part of the nervous system. Nova and many of the RBPs described herein are preferentially expressed in neuronal tissue such as neurons, and in the nervous system in general. In one embodiment, the material, agonist, or antagonist may be targeted to an inhibitory synapse. In another embodiment, the material, agonist, or antagonist may be targeted to a nucleus. In one embodiment, the material, agonist, or antagonist may be targeted to a cytoplasm. Nova was shown in this invention to be present in inhibitory synapses, nucleus, and cytoplasm of neurons (Examples 6 and 7). It is known to those skilled in the art that many proteins that mediate splicing are present in the cytoplasm.

The term “neuron” refers, in one embodiment, to any cell that functions in the central nervous system or the peripheral nervous system. In another embodiment, the term refers to any cell located in or near tissue of the central nervous system or the peripheral nervous system.

In another embodiment, the present invention provides a kit for carrying out the method of assessing a splicing pattern of a gene comprising a nucleic acid sequence as set forth in SEQ ID No 1-335 or 336-449. Kits are packages that facilitate a diagnostic or other procedure by providing the materials or reagents needed therefor in a convenient format. Many kits have been successfully commercialized.

Gephyrin is a protein that plays a key scaffolding role in the inhibitory synapse. Gephyrin is essential for the correct localization of GABA_(A) γ2 and GlyRα2 subunits to the synapse, the same subunits whose pre-mRNA splicing is regulated by Nova. Finally, gephyrin, like Nova, has been reported to be the target of a cancer-associated neurologic disorder, SMS, manifest by excess motor activity.

In one embodiment, modulating the expression or activity of a Nova protein may in turn affect the expression or activity of a gephyrin protein or gephyrin RNA. In one embodiment, the splicing, expression, export from the nucleus, localization, or export from the cell of gephyrin RNA may be affected. In another embodiment, the expression, concentration, or activity of gephyrin protein may be affected.

In another embodiment, the invention provides an RBP binding site comprised of nucleic acid having a sequence comprising n repeats of a sequence selected from the group consisting of CCAU, UCAU, UCAC, CCAC, UCCAUC, CCAUCC, AUCCAU, CAUCCA, UCAUCC, CAUCAU, CCAUCU, CCUCCC, CUCAUC, CAUCCU, CUCACC, AUCAUC, CCAUCA, CCCAUC (SEQ ID No 450-467, respectively), where n is an integer between 1 and 10, inclusive. Each of these sequences represents a separate embodiment of the present invention. These binding sites may be collectively referred to as a “YCAY motif” (SEQ ID No 468), in which “Y” may refer to Cytosine, Uridine, Thymidine, or derivatives thereof. Another embodiment may be referred to as a “YCAYY motif” (SEQ ID No 469).
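By way of illustration only, the YCAY motif described above can be located in a sequence with a simple pattern search. The following is a minimal sketch, assuming that “Y” is read as C, U, or T (derivatives are not modeled); the function name `find_ycay_sites` and the example sequence are hypothetical and are not taken from the sequence listing.

```python
import re

# "Y" is read here as C, U, or T; nucleotide derivatives are not modeled.
# The lookahead lets overlapping tetramers be reported individually.
YCAY = re.compile(r"(?=([CUT]CA[CUT]))")


def find_ycay_sites(seq):
    """Return the 0-based start position of every YCAY tetramer in seq."""
    return [m.start() for m in YCAY.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical sequence, used only to exercise the function.
    print(find_ycay_sites("GGUCAUCCAUAA"))
```

Such a scan reports candidate sites only; whether an RBP actually binds at a given position is a matter for the experimental methods described herein.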

It should be noted that the consensus Nova binding site is similar to the consensus sequence for branch site recognition in animal cells, linking the action of Nova to branch site recognition. Accordingly, this invention includes embodiments in which effects on splicing by Nova or other RBP are mediated by effects on branch point recognition.

In another embodiment, the invention provides an RBP binding site comprised of nucleic acid having a sequence comprising n repeats of a sequence selected from the group consisting of GTTTT, GTTT, CTTTT, CTTT, GTTTC, CTTTC (SEQ ID No 503-508, respectively), where n is an integer between 1 and 10, inclusive.

In one embodiment, n is between 1 and 9. In another embodiment, n is between 1 and 8. In another embodiment, n is between 1 and 7. In another embodiment, n is between 1 and 6. In another embodiment, n is between 1 and 5. In another embodiment, n is between 1 and 4. In another embodiment, n is between 1 and 3. In another embodiment, n is between 1 and 2. In another embodiment, n is between 2 and 5. In another embodiment, n is between 3 and 5. In another embodiment, n is between 4 and 5. In another embodiment, n equals exactly 3.
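As an illustrative sketch, the number of tandem repeats n of a given repeat unit can be counted as follows. The sketch assumes that repeats are contiguous (back to back) and reports each maximal run; the function name `tandem_repeat_runs` and the example sequences are hypothetical.

```python
import re


def tandem_repeat_runs(seq, unit):
    """Return (start, n) for each maximal run of `unit` repeated n times in tandem."""
    unit = unit.upper()
    pattern = re.compile(r"(?:%s)+" % re.escape(unit))
    return [(m.start(), len(m.group(0)) // len(unit))
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical sequences, used only to exercise the function.
    print(tandem_repeat_runs("AAGTTTTGTTTTGTTTTCC", "GTTTT"))
    print(tandem_repeat_runs("CCAUGGCCAUCCAU", "CCAU"))
```

The value of n returned for each run may then be compared with any of the ranges recited above.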

It will be appreciated by one skilled in the art that nucleic acid substitutions, analogues, or derivatives of the binding sites that result in increased binding strength that can be detected by a method of the present invention are also included in the present invention. In another embodiment, sequences highly homologous to the claimed binding sites also form a part of the present invention.

In another embodiment, the invention provides an isolated nucleic acid comprising a sequence set forth in SEQ ID No 63, 64, 76, 77, 78, 84, or 292-335. In another embodiment, there is provided an isolated nucleic acid that comprises a sequence set forth in SEQ ID No 374, 377, 378, 380, 382, 384, 387, 394-396, 415, 416, or 421. In one embodiment, the nucleic acid is RNA or a derivative thereof. In another embodiment, the invention provides an oligonucleotide of at least 15 bases, with a nucleic acid sequence corresponding to SEQ ID NO 1-335, or a complementary sequence thereof. In another embodiment, the invention provides an oligonucleotide of at least 15 bases, with a nucleic acid sequence corresponding to SEQ ID NO 336-449, or a complementary sequence thereof. In one embodiment, the oligonucleotides may be either sense or antisense in orientation. Homologues of the isolated nucleic acid sequences and oligonucleotides are also included in the current invention.
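For illustration, an antisense (complementary) oligonucleotide and the set of candidate oligonucleotides of a given minimum length can be derived from a sequence as sketched below. The sketch assumes RNA output, with T in the input treated as U; the names `antisense_rna` and `oligo_windows` and the example sequence are hypothetical and do not correspond to any SEQ ID No.

```python
# Map each base to its RNA complement; T in the input is treated like U.
_COMPLEMENT = str.maketrans("ACGUT", "UGCAA")


def antisense_rna(seq):
    """Return the reverse complement of seq, written 5' to 3' as RNA."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def oligo_windows(seq, length=15):
    """Return every contiguous subsequence of exactly `length` bases."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


if __name__ == "__main__":
    example = "AUGGCCAUCAUCCAUGGA"  # hypothetical 18-base sequence
    for sense in oligo_windows(example):
        print(sense, antisense_rna(sense))
```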

In one embodiment, “oligonucleotides” are short-length, single- or double-stranded polydeoxynucleotides that are chemically synthesized by known methods (such as phosphotriester, phosphite, or phosphoramidite chemistry, using solid phase techniques such as described in EP 266,032 published 4 May 1988, or via deoxynucleoside H-phosphonate intermediates as described by Froehler et al, Nucl. Acids Res, 14: 5399-5407 [1986]). They may then, in one embodiment, be purified on polyacrylamide gels. Oligonucleotides may be composed of any of the embodiments of nucleic acids disclosed herein.

In another embodiment, the invention provides an isolated peptide having an amino acid sequence encoded by a nucleic acid sequence as set forth in SEQ ID No 63, 64, 76, 77, 78, 84, or 292-335. In another embodiment, the invention provides an isolated peptide encoded by a nucleic acid sequence as set forth in SEQ ID No 374, 377, 378, 380, 382, 384, 387, 394-396, 415, 416, or 421. Homologues of the isolated peptides are also included in the current invention.

The RNA sequences, oligonucleotides, and peptides may be used for a wide variety of purposes, including the diagnosis or treatment of a disease or disorder, screening methods to identify bioactive molecules, searches for homologous sequences, scientific experiments, or identification of useful consensus sequences that may not be exactly represented among the sequences. The RNA sequences, oligonucleotides, and peptides may be subcloned into a plasmid or vector, ligated into another molecule, amplified, or subjected to any other procedure for manipulating nucleic acids that is known in the art (see, for example, Molecular Cloning (2001), Sambrook and Russell, eds.).

In another embodiment, the invention provides a method of modifying an expression profile of a gene of interest, comprising engineering the gene of interest to comprise an RNA motif comprising a sequence as set forth in SEQ ID No 1-335 and 450-469, thereby modifying the expression profile of the gene of interest. In one embodiment, the sequences are inserted alone. In another embodiment, the sequences are inserted together with surrounding sequence.

In one embodiment, the modification of the gene is tissue specific. In one embodiment, the modification is specific to neuronal tissue. In another embodiment, the tissue-specific modification is specific to, for example, the nervous system. The nervous system may refer to, for example, the central nervous system, the peripheral nervous system, a portion of the nervous system, or any combination of these elements. In another embodiment, the modification is not tissue specific. Each of these represents a separate embodiment of the present invention.

A “tissue specific” modification refers, in one embodiment, to a modification that is only manifest in specific tissues. In another embodiment, “tissue specific” modification refers to a modification that is primarily manifest in specific tissues.

In one embodiment, the modification may affect the splicing of an RNA molecule. In one embodiment, the modification may affect the branch point recognition of an RNA molecule. In another embodiment, the modification may affect the extent of export of an RNA molecule from the nucleus. In another embodiment, the modification may affect the localization of an RNA molecule to or near the inhibitory synapse. In another embodiment, the modification may affect the extent of export of an RNA molecule from the cell. In another embodiment, the modification may affect the expression of a protein encoded by an RNA molecule. In another embodiment, the modification may affect the steady-state concentration of a protein encoded by an RNA molecule. In another embodiment, the modification may affect the activity of a protein encoded by an RNA molecule. In one embodiment, a parent RNA molecule may be affected in one of the ways mentioned herein.

In one embodiment, the RBP binding site is introduced into an exon of a gene of interest. In another embodiment, the binding site is introduced into an intron of a gene of interest. In one embodiment, the binding site is introduced into a 5′ untranslated region of a gene of interest. In one embodiment, the binding site is introduced into a 3′ untranslated region of a gene of interest.

The binding sites can be inserted, in one embodiment, into a gene of interest by joining the appropriate nucleotide sequences encoding the desired amino acid sequences to each other by methods commonly known in the art, such as, for example, ligation. The resulting nucleic acid can then be subcloned into an appropriate expression vector as described herein, or can be flanked by sequences that will promote intra-chromosomal insertion (e.g., by homologous recombination or random integration) and introduced into the desired host cell, where it may be expressed.

In one embodiment, the gene of interest is Nova-1, Nova-2, or a homologue thereof. In another embodiment, the gene is not a homologue of Nova-1 or Nova-2. In one embodiment, the gene is a reporter gene.

In another embodiment, the gene may be any nucleotide sequence, the manipulation of which may be deemed desirable for any reason (e.g., treat disease, confer improved qualities, etc.), by one of ordinary skill in the art. Such nucleotide sequences include, but are not limited to, coding sequences of structural genes (e.g., reporter genes, selection marker genes, oncogenes, drug resistance genes, growth factors, etc.), and non-coding regulatory sequences which do not encode an mRNA or protein product (e.g., promoter sequence, polyadenylation sequence, termination sequence, enhancer sequence, etc.). Each sequence represents a separate embodiment of the present invention.

A wide variety of genes can be modified by the binding sites of the present invention, such as genes encoding vaccines, antigens, toxic gene products, potentially toxic gene products, and anti-proliferation or cytostatic gene products. Reporter genes can also be modified, including those encoding enzymes (e.g., chloramphenicol acetyltransferase, beta-galactosidase, luciferase, beta-glucuronidase, HIS3), fluorescent proteins such as green fluorescent protein, or antigenic markers. Reporter genes, in one embodiment, are genes whose expression can be detected by detection methods known in the art.

In one embodiment, the nucleic acid sequence or RBP binding site inserted into the gene of interest may compete for a biological factor that binds a different molecule or binding site. When a molecule or binding site interacts with a biological factor that is present at limiting concentrations, it may, in one embodiment, reduce the number of copies of the biological factor that are available to interact with other molecules or binding sites, competing with the other molecules or binding sites for the biological factor. In another embodiment, competing for a biological factor may eliminate other copies of the biological factor that are available to interact with other molecules or binding sites. Thus, competition for a biological factor may reduce or eliminate the number of copies of the biological factor that interact with the other molecule or binding site.

There may be, in one embodiment, another binding site for the biological factor at a different portion of the gene of interest. In another embodiment, there may be another binding site for the biological factor on a different gene. In this case, competition for the biological factor by the engineered binding site may reduce or eliminate the number of copies of the biological factor that interact with the alternate site of the gene of interest or other gene. In this way, the engineered binding site may indirectly affect the expression pattern of the gene of interest or other gene. In one embodiment, the splicing pattern of the gene of interest or other gene may be affected. Competition for binding sites has been shown to affect the activity of the splicing factor Sub2P.

The present invention includes, in one embodiment, the use of the modified gene to diagnose, treat, ameliorate, or prevent a disease or ailment. In one embodiment, the disease or ailment is POMA. In another embodiment, the disease is another neurologic disorder as described herein. In another embodiment, the disease is a non-neurologic disorder. In another embodiment, the disease is an autoimmune disorder as described herein.

In one embodiment, the diagnosis, treatment, amelioration, or prevention of the disease may comprise administering the modified gene in a pharmaceutical composition to a subject in need of such treatment. In another embodiment, the diagnosis, treatment, amelioration, or prevention of the disease may comprise the use of a reporter gene modified as described herein. The use of the modified reporter gene may include, but is not limited to, diagnosis of defects in RNA splicing, export, localization or protein expression, concentration or activity of endogenous genes. Each of these methods represents a separate embodiment of the present invention.

In another embodiment of the invention, the aforementioned vector is introduced into an embryonic cell or other type of cell, for the construction of a transgenic animal to regulate the expression of a transgene in a tissue specific manner.

The modified genes can be introduced into animals by transgenic technology. Animals of any species, including, but not limited to, mice, rats, rabbits, guinea pigs, pigs, micro-pigs, goats, sheep, cattle, chickens, fish and non-human primates, e.g., baboons, monkeys and chimpanzees may be used to generate transgenic animals. The term “transgenic”, in one embodiment, refers to animals expressing coding sequences from a different species (e.g., mice expressing human gene sequences), as well as animals that have been genetically engineered to no longer express, or express inactive versions of, endogenous gene sequences, (i.e., “knockout”, or “null” animals). In transgenic animals that express coding sequences from a different species, as well as in the genetically engineered “knock out” transgenic animals, the altered coding sequences are present in a stably integrated form in their somatic cells, and may also be stably integrated into their germ cell lines so that the altered coding sequences are passed on to their progeny. The present invention encompasses transgenic animals whose progeny contain such stably integrated altered coding sequences as well as transgenic animals wherein the altered coding sequences are stably integrated only in their somatic cells, and therefore not passed on to their progeny. As used herein, “progeny” also refers to subsequent generations of single cells. Methods for the preparation and use of such animals are known in the art. A protocol for the production of a transgenic pig can be found in White and Yannoutsos, Current Topics in Complement Research: 64th Forum in Immunology, pp. 88-94; U.S. Pat. No. 5,523,226; U.S. Pat. No. 5,573,933; PCT Application WO93/25071; and PCT Application WO95/04744. A protocol for the production of a transgenic rat can be found in Bader and Ganten, Clinical and Experimental Pharmacology and Physiology, Supp. 3:S81-S87, 1996. 
Protocols for the production of a transgenic cow and a transgenic sheep can be found in Transgenic Animal Technology, A Handbook, (1994), Pinkert, C. A., ed., Academic Press, Inc.

Transgenic non-human animals may be produced by introducing altered coding sequences into the germ line of the non-human animal. Embryonal target cells at various developmental stages may be used to introduce the altered coding sequences of the invention. Different methods may be used depending on the stage of development of the embryonal target cell(s). Such methods include, but are not limited to, microinjection of zygotes, viral integration, and transformation of embryonic stem cells as described below, and in U.S. Pat. No. 6,613,958 and references cited therein. Each such method represents a separate embodiment of the present invention.

Microinjection of zygotes is one method for incorporating altered coding sequences into animal genomes. A zygote, which is a fertilized ovum that has not undergone cell division, is the preferred target cell for microinjection of transgenic DNA sequences. The male pronucleus of the mouse zygote reaches a size of approximately 20 micrometers in diameter, a feature which allows for the reproducible injection of 1-2 picoliters of a solution containing transgenic DNA sequences. The use of a zygote for introduction of altered coding sequences has the advantage that, in most cases, the injected transgenic DNA sequences will be incorporated into the host animal's genome before the first cell division (Brinster et al., Proc. Natl. Acad. Sci. USA 82:4438). As a consequence, all cells of the resultant transgenic animals (founder animals) stably carry an incorporated transgene at a particular genetic locus, referred to as a transgenic allele. The transgenic allele demonstrates Mendelian inheritance, i.e., half of the offspring resulting from the cross of a transgenic animal with a non-transgenic animal will inherit the transgenic allele, in accordance with Mendel's rules of random assortment.
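The expected Mendelian proportions noted above can be illustrated with a short calculation. The following sketch assumes a single transgenic locus, with a founder written as “T+” (one transgenic allele “T” and one unmodified locus “+”) and a non-transgenic animal written as “++”; this notation and the function name `offspring_genotypes` are hypothetical conventions adopted only for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent_a, parent_b):
    """Return expected genotype fractions, assuming one allele from each parent
    is transmitted with equal probability (single locus, random assortment)."""
    counts = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    # Founder crossed with a non-transgenic animal.
    print(offspring_genotypes("T+", "++"))
    # Cross between two animals each carrying one transgenic allele.
    print(offspring_genotypes("T+", "T+"))
```

In the first cross, half of the expected offspring carry the transgenic allele, consistent with the statement above.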

Viral integration can also be used to introduce the altered coding sequences of the invention into an animal. This method is further described in Jaenisch, Proc. Natl. Acad. Sci. USA 73:1260. Introduction of altered coding sequences into germ line cells by this method is possible but probably occurs at a low frequency. However, once a transgene has been introduced into germ line cells by this method, offspring may be produced in which the transgenic allele is present in all of the animal's cells, i.e., in both somatic and germ line cells.

Embryonal stem (ES) cells can also serve as target cells for introduction of the transgenes of the invention into animals, as described in Evans et al., Nature 292:154. Once a transgene has been introduced into germ line cells by this method, offspring may be produced in which the transgenic allele is present in all of the animal's cells, i.e., in both somatic and germ line cells.

In another embodiment, the present invention provides a transgenic mouse comprising a mutation in a Nova-2 gene (FIGS. 4, 9, and 10). In one embodiment, the Nova-2 mutation may alter, diminish, or abrogate expression of Nova-2 RNA or Nova-2 protein. In another embodiment, the mutation may result in production of Nova-2 protein with an altered amino acid sequence. In another embodiment, the mutation may result in production of Nova-2 RNA with an altered nucleotide sequence. In another embodiment, the mutation is a null mutation. In another embodiment, the mutation is a temperature sensitive mutation. In another embodiment, the mutation, or expression of the transgenic Nova-2 protein, is manifest in a tissue-specific manner.

In one embodiment, the transgenic mouse may be generated by the technique of cre-lox recombination (FIG. 9A). In this system, the cre enzyme (Cre recombinase, a site-specific recombinase analogous to, but distinct from, flp recombinase) is used to catalyze site-specific recombination of homologous sequences between a targeting vector and genomic DNA, such that either the genomic DNA is replaced by sequences from the targeting vector, or sequences from the targeting vector are inserted into genomic DNA sequences. Cre binds to sites known as loxP sites, catalyzing recombination between loxP sites on other molecules, or between multiple sites on the same molecule. “Recombination” refers, in one embodiment, to the breaking and re-joining of strands of DNA to produce a rearranged molecule. The use of cre-lox recombination is known to those skilled in the art (see, for example, Fuchs E C, et al, Proc Natl Acad Sci USA 98:3571-76).
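Recombination between two loxP sites on the same molecule can be pictured with a toy model. The sketch below rests on an assumption not stated above, namely the commonly described behavior that two loxP sites in the same orientation lead to excision of the intervening segment, leaving a single loxP site behind. The DNA is represented as a list of named elements rather than as actual sequence; the function name `excise_between_sites` and the element names are hypothetical.

```python
def excise_between_sites(elements, site="loxP"):
    """Model excision of the segment between the first two same-orientation sites.

    `elements` is an ordered list of named DNA elements. One copy of the site
    is kept, and the segment between the two sites is removed together with
    the second copy of the site.
    """
    positions = [i for i, name in enumerate(elements) if name == site]
    if len(positions) < 2:
        # Fewer than two sites: nothing to recombine in this model.
        return list(elements)
    first, second = positions[0], positions[1]
    return list(elements[:first + 1]) + list(elements[second + 1:])


if __name__ == "__main__":
    # Hypothetical targeted locus with a marker flanked by two loxP sites.
    locus = ["exon1", "loxP", "neo", "loxP", "exon2"]
    print(excise_between_sites(locus))
```

A model of this kind illustrates, for example, how a selectable marker flanked by loxP sites might be removed after targeting; it does not capture site orientation, inversion, or recombination between separate molecules.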

In one embodiment of cre-lox recombination, the cre enzyme may be inserted into the genome of the transgenic mouse. In another embodiment, the cre enzyme may be under the control of an inducible promoter. In another embodiment, the sequence inserted into the mouse's genome may contain one or more selectable markers that confer antibiotic resistance, such as, for example, neomycin resistance. In another embodiment, the selectable markers may be removed subsequent to inserting the sequence into the genome. In another embodiment, cre-lox recombination may be performed on embryonic stem cells, which may, in one embodiment, be subsequently used to generate a transgenic mouse. In another embodiment, insertion of the foreign sequence into the mouse's DNA may be verified by Southern blot (FIG. 9B). Each of these methods represents another embodiment of the present invention.

Nova-2 knockout mice were shown to exhibit splicing defects in GABA_(A) γ2 RNA (FIG. 10), showing that Nova-1 and Nova-2 are capable of mediating similar functions. In addition, it was shown that the knockout mice had elevated circulating levels of IGF-1, and low levels of serum glucose. This shows that RBPs in general, and Nova in particular, have implications in metabolic systems, and potentially those involving autoimmune diseases.

In another embodiment, cells may be isolated from the transgenic animals described herein. Transgenic animals, or cells isolated from them, may be used, in one embodiment, as disease models for POMA or other diseases, or for other purposes apparent to one skilled in the art. In one embodiment, the isolated cells, or their progeny, may be used for one of the purposes described herein. In another embodiment, the isolated cells may be used to generate cell lines, which may be used for one of the purposes described herein. A “cell line”, in one embodiment, refers to a lineage of cells derived from a cell that has been immortalized. An “immortalized cell” refers, in one embodiment, to a cell that has enhanced ability to replicate, or to be propagated in cell culture. Each such use of the transgenic animals or cells derived from them represents an additional embodiment of the present invention.

In another embodiment, there is provided a method for identifying a therapeutic agent for the treatment of a disease or disorder associated with Nova-2 gene expression. In one embodiment, this method comprises the steps of (a) contacting a cell of a transgenic mouse described herein with a candidate therapeutic agent; (b) determining qualitative or quantitative changes in a parameter in the transgenic mouse associated with the disease or disorder in the presence of the candidate therapeutic agent; and (c) determining qualitative or quantitative changes in a parameter in the transgenic mouse associated with the disease or disorder in the absence of the candidate therapeutic agent. If amelioration or abrogation of the disease or disorder is observed in the transgenic mouse, this may indicate a therapeutic effect of the candidate therapeutic agent. In one embodiment, the parameter that is observed may be a symptom, indicator, or manifestation of the disease or disorder of interest.

In another embodiment, there is provided a method for studying a symptom, disease or disorder associated with Nova-2 gene expression, comprising ascertaining the presence or absence of the symptom, disease or disorder in the transgenic mouse described herein. In another embodiment, a quantitative, or qualitative change in the extent of the symptom, disease or disorder may be measured. In one embodiment, the symptom, disorder, or disease may be POMA, one of the diseases or disorders disclosed herein, or an associated symptom. In one embodiment, exhibition of the symptom, disease, or disorder in the transgenic mouse indicates involvement of Nova-2 gene expression in the symptom, disease, or disorder. In another embodiment, if the symptom, disease, or disorder is present in the transgenic mouse, the transgenic mouse may be used as a disease model. Methods for measuring or ascertaining a symptom, disorder, or disease are known in the art, and vary according to the particular symptom, disorder, or disease. In one embodiment, they may comprise direct observation, observation of a change in a behavior pattern, or observation of a physiological change, a histological change, or a manifestation of pain. In another embodiment, the method may comprise administering a test of the animal's fitness, activity level, immunological health, coordination, or mental ability. Each such method represents a separate embodiment of the present invention.

In another embodiment, there is provided a method for identifying a therapeutic agent for the treatment of a disease or disorder associated with Nova-2 gene expression, comprising (a) contacting a cell of a transgenic mouse of the present invention with a candidate therapeutic agent; (b) assessing a parameter associated with the disease or disorder in the transgenic mouse in the presence of the candidate therapeutic agent; and (c) assessing a parameter associated with the disease or disorder in the transgenic mouse in the absence of the candidate therapeutic agent, wherein amelioration of the parameter in the transgenic mouse is an indication of a therapeutic effect of the candidate therapeutic agent.

Disease models may be used to test potential therapies or prophylactic measures, or to understand the etiology of a symptom, disease, or disorder. Methods for using mice as a disease model are known in the art, and vary according to the particular symptom, disorder, or disease. Each such method represents a separate embodiment of the present invention.

In one embodiment, the disease or disorder that is studied, or for which a therapeutic agent is identified, is a neurological disorder as described herein. In another embodiment, the disorder is POMA, Multiple Sclerosis, Alzheimer's Disease, Huntington's Disease or Parkinson's Disease. In one embodiment, the disorder is an autoimmune disorder as described herein.

In another embodiment, the method for studying a symptom, disease, or disorder may involve complementing the transgenic mouse with a wild-type Nova-2 gene. “Complementation”, in one embodiment, refers to contacting a transgenic animal, or a cell derived from the animal, with a gene or nucleic acid molecule encoding a wild-type copy of a gene that has been mutated or deleted. Restoration of a phenotypic characteristic that is altered in the transgenic animal may indicate, in one embodiment, that the observed phenotypic characteristic is due to the altered gene. Techniques for complementation of mutated genes are known to those skilled in the art (see, for example, Ikenaka et al, Dev Neurosci. 17:127-36, and references cited herein for transgenic animals).

In another embodiment, complementation of the transgenic mouse with a wild-type Nova-2 gene may utilize a vector comprising a wild-type Nova-2 gene under the control of an inducible promoter. “Inducible promoters,” in one embodiment, refer to promoters whose activity can be affected by the presence of one or more factors. Inducible promoters are familiar to those skilled in the art, and are discussed, for example, in Christen et al Transgenic Res. 11:587-95.

In another embodiment, the present invention provides a method of studying a Nova-2 protein function, comprising assessing a qualitative or quantitative change in a parameter in a transgenic mouse of the present invention, thereby studying a Nova-2 protein function.

In one embodiment, the parameter measured may be the pattern of splicing, localization, export from the nucleus, or export from the cell of an RNA molecule. In another embodiment, the parameter measured may be the pattern of expression, concentration, or activity of a protein molecule. Methods for measuring changes in splicing, localization, export, expression, concentration, or activity of an RNA or protein molecule have been described herein. In one embodiment, the changes detected may be a quantitative or qualitative change, increase, or decrease in the magnitude, frequency, duration, consistency, or reproducibility of the parameter.

In another embodiment, the method of determining downstream effects of altered Nova-2 gene expression may involve complementing the transgenic mouse with a wild-type Nova-2 gene as described herein. In this case, a parameter is measured in a wild-type mouse, an uncomplemented transgenic mouse, and a transgenic mouse that has been complemented with the wild-type Nova-2 gene. If a difference is observed between the wild-type and uncomplemented transgenic mice, partial or complete restoration of the wild-type phenotype in the complemented transgenic mouse corroborates the conclusion that altered Nova-2 gene expression affects the phenotype. In another embodiment, the complementation of the transgenic mouse with a wild-type Nova-2 gene may utilize a vector comprising a wild-type Nova-2 gene under the control of an inducible promoter, as described herein.

In another embodiment, the method of determining downstream effects of altered Nova-2 gene expression may involve isolating one or more nucleic acid molecules that interact with Nova-2 protein. In one embodiment, the isolation of the nucleic acid molecules may take place by any method known in the art, such as immunoprecipitation or any of the methods described herein. In one embodiment, the immunoprecipitation may be carried out with an antibody or antiserum that recognizes Nova protein as described herein. In another embodiment, the nucleic acid molecules may be purified. In another embodiment, the nucleic acid molecules may be identified. A change in a pattern of splicing, expression, localization, or export from the nucleus or from the cell of the nucleic acid molecules in the transgenic mouse may indicate that expression of Nova-2 protein affects the parameter measured.

In another embodiment, the method of isolating nucleic acid molecules that interact with Nova-2 protein may comprise the CLIP method disclosed herein. The CLIP fragments obtained from the wild-type and transgenic mouse may be compared to ascertain differences between them, the differences indicating an effect of altered Nova-2 gene expression.

In another embodiment, the method of isolating nucleic acid molecules that interact with Nova-2 protein may comprise the use of differential display. Mammals, such as human beings, have about 100,000 different genes in their genome, of which only a small fraction, perhaps 15%, are expressed in any individual cell. Differential display techniques permit the identification of genes specific for individual cell types. Briefly, in differential display, the 3′ terminal portions of mRNAs are amplified and identified on the basis of size. Using a primer designed to bind to the 5′ boundary of a poly(A) tail for reverse transcription, followed by amplification of the cDNA using upstream arbitrary sequence primers, mRNA sub-populations are obtained. Differential display techniques are described in more detail in U.S. Pat. No. 6,623,928 and references cited therein.

In another embodiment, the transgenic mouse may be used to determine whether a test compound exhibits Nova agonist or antagonist activity. This method comprises the steps of: (a) contacting a cell of the transgenic mouse of the present invention with a test compound; (b) contacting a cell of a Nova-2-expressing mouse with the test compound; and (c) determining changes in a parameter associated with Nova activity. In one embodiment, the parameter measured may be the splicing pattern of a gene affected by Nova (Example 4) such as, for example, one of the genes disclosed herein. In another embodiment, the parameter measured may be the pattern of nucleic acid molecules isolated by the CLIP method with anti-Nova antisera, or isolated by another method known in the art for isolating nucleic acid molecules associated with a protein. Enhancement of Nova activity may be indicative of agonist activity, and diminution or abrogation of Nova activity may be indicative of antagonist activity.

EXAMPLES

Example 1 CLIP Specifically Isolates Covalently Bound RBP-RNA Complexes

Materials and Experimental Methods

Immunoblot Analysis.

The following antibodies were used: gephyrin (Transduction laboratories), rabbit Nova antiserum (Buckanovich, R. J. et al, Mol Cell Biol 17, 3194), Hsp90 (Transduction laboratories), rabbit brPTB antiserum (Polydorides, A. D. et al, Proc Natl Acad Sci USA 97, 6350), dimethyl-Histone H3 (Upstate Biotechnology). Antibody to neuronal Hu proteins and Human POMA serum were obtained from paraneoplastic neurologic disease patients. BrPTB antibody was used as previously described (Polydorides A D et al, Proc Natl Acad Sci USA 97: 6350-55).

Nova-1 Knockout Mice.

Nova-1 knockout mice were previously described (Jensen K B et al, Neuron 25:359-71).

UV Cross-Linking of Mouse Tissue.

Mouse hindbrain and spinal cord tissue was dissected from 60 postnatal day 4 P8 mice, and rapidly disaggregated in 50-milliliter (ml) polypropylene tubes using a rubber syringe plunger. This tissue suspension was filtered through a 200 micron (μm) nylon filter, the filter was washed with more PBS (total volume was about 100 ml), and flow-through was transferred to 50 ml falcon tubes and spun at 2500 rotations per minute (rpm), 10 minute (min) at 4° Celsius (C). Supernatant was removed and cells resuspended in 80 ml total volume of PBS. Washed tissue material was then resuspended with 1×PBS and placed in 150-millimeter (mm) culture plates (10 ml cell suspension per plate) for irradiation. Irradiation was carried out with a mercury light (maximum emission at 254 nm) to a final energy of 400 millijoule (mJ)/centimeter (cm)².

Suspension was collected, and then each irradiated plate washed with 15 ml fresh PBS. Wash and suspension were combined and pelleted at 2500 rpm, 10 min, 4° C.; then resuspended in 30 ml total volume of PBS. Suspension was distributed to Eppendorf tubes (1 ml per tube), pelleted, and supernatant removed. Pellets were frozen at −80° C. until use.

Immunoprecipitation Solutions

PXL solution comprises 1×PBS (tissue culture grade; no Mg++, no Ca++) or 5×PBS (designated as 1×PXL and 5×PXL, respectively), 0.1% SDS, 0.5% sodium deoxycholate, and 0.5% Nonidet P-40 (NP-40).

1×PNK+ comprises 50 mM tris(hydroxymethyl)aminomethane (tris)-Cl pH 7.4, mM MgCl2, and 0.5% NP-40.

Bead Preparation:

300 microliter (μl) of protein A-Dynabeads stock were used for each Eppendorf tube of cross-linked lysate. Beads were washed 3 times with 1×PXL and resuspended in 150 μl 1×PXL. 60 μl rabbit anti-Nova and protein A-Dynabeads were added to each 200 μl of bead stock. Tubes were rotated at room temperature for 30-45 min and washed 3 times with 1×PXL and 1 time with 5×PXL.

RBP-RNA Complex Isolation

350-400 μl per tube 1×PXL was added to cross-linked lysate, then lysate was incubated on ice for 10 minutes in a total volume of 900 μl. 40 μl RNAsin and 40 μl of Promega RQ1 DNAse were added to each tube, and samples incubated at 37° for 15 min, rotating at 1000 rpm. 4-6 μl of Ambion biochemistry grade RNAse T1 diluted 1:500 in PXL was added, and tubes were incubated at 37° for 10 min. Lysates were centrifuged in a pre-chilled micro-ultracentrifuge at 90,000 rpm for 25 min at 4° (polycarbonate tubes in TLA 120.2 rotor). Supernatant was removed, added to a prepared tube of beads, and rotated for 1 hour at 4°. Beads were then washed with ice-cold buffer, using wash volumes of 800-1000 μl, with the following number of iterations: 2 times with 1×PXL, 1 time with 5×PXL, and 3 times with 1×PNK+. Beads were resuspended in 80 μl of 1×PNK+, and 10 μl of ³²P gamma-ATP and 10 μl of T4 polynucleotide kinase enzyme were added. Tubes were incubated in a thermomixer at 37° and 1000 rpm for 30 min. 5 μl of 100 mM Adenosine TriPhosphate (ATP) was added, and tubes incubated another 5 min. Beads were washed 4 times with 1×PNK+.

RBP-RNA Complex Purification

Washed beads were resuspended in 30 μl of 1×PNK+ and 30 μl of Novex loading buffer, and incubated at 70° C. for 10 min at 1000 rpm. Beads were isolated and the supernatant loaded on a Novex NuPAGE 10% Bis-Tris gel. Each tube was loaded onto 3 wells. After gel run, gel was transferred to BA-85 NC Schleicher & Schuell Bioscience, Inc. (Keene, N.H.) using a Novex wet transfer apparatus. Most of the radioactive signal in the gel was below a MW of about 20-15 kDa, and this portion of the gel was cut off prior to transfer. After transfer, NC filter was rinsed in 1×PBS and gently blotted on Kimwipes; membrane was wrapped in plastic wrap and exposed to film.

RNA Isolation and Purification Solutions

1×PK buffer comprises 100 mM Tris-Cl pH 7.5, 50 mM NaCl, and 10 mM EDTA. 1×PK buffer/7 M urea comprises 100 mM Tris-Cl pH 7.5, 50 mM NaCl, 10 mM EDTA, and 7 molar (M) urea. This buffer should be made fresh.

The piece of NC corresponding to the radioactive band (FIG. 1B) was cut out using a scalpel blade, and cut into pieces as small as possible. This band was positioned 5-10 kDa higher than Nova protein alone, as assayed by Western blotting (data not shown). 4 milligram (mg)/ml proteinase K (prot K) was added to 1×PK buffer and pre-incubated at 37° C. for 20 min to inactivate RNAses. 200 μl of prot K solution was combined with each isolated NC piece in a microcentrifuge tube and incubated 20 min, 37° at 1200 rpm. 200 μl of prot K/7M urea solution was added, and the tubes were incubated 20 min, 37° at 1200 rpm. 400 μl “RNA phenol” and 130 μl of CHCl₃ was added to tubes, which were then incubated at 37° for 20 min at 1400 rpm. “RNA phenol” is pure phenol that has been equilibrated with 0.15 M Sodium acetate (NaOAc) pH 5.2; “CHCl₃” is chloroform at a ratio of 49:1 with isoamyl alcohol. Tubes were spun at 14,000 rpm in a microcentrifuge, and aqueous phase was transferred to empty tubes. 50 μl 3M NaOAc pH 5.2 and 1 ml of 1:1 ethanol:isopropanol (EtOH:isopropanol) was added. Samples were precipitated overnight at −20°

RNA Ligations

RNA was spun down, washed and dried, and counted in a scintillation counter (Cherenkov). 20% of sample was set aside as an unligated control.

Directional RNA Ligations:

The RL5 oligonucleotide (SEQ ID No 477) contained a 5′-OH and a 3′-OH group, which allowed it to be coupled to the 5′ phosphorylated end of the RNA fragment. The RL3 RNA oligonucleotide (SEQ ID No 492) contained a 5′ phosphate and a 3′ end blocked with puromycin, and could only link to the tag at the 3′ end. RL5 and RL3 were both obtained from the company Dharmacon. The ligation mixture comprised 1 μl 10×T4 RNA ligase buffer (3U, Fermentas), 0.3 T4 RNA ligase (Fermentas), 1 μl BSA (0.2 mg/ml), 1 μl of 10 mM ATP, and 1 μl RL5 linker at 20 picomole (pmol)/μl, and 5.7 μl H₂O, in which RNA was resuspended. Mixture was incubated at 16° for 60 min, and 1 μl RL3 linker at 40 pmol/μl, 0.5 μl 0.5 ATP, and 0.2 μl T4 RNA ligase was added. Mixture was incubated again at 37° for 30 minutes. The following mixture was then added to the reaction: 77 μl H₂O, 11 μl 10×DNAse I buffer, 5 μl RNAsin, and 5 μl RQ1 DNAse (Promega). Mixture was incubated at 37° for 20 min. The following was then added to the reaction: 300 μl H₂O, 300 μl “RNA phenol”, and 100 μl CHCl₃. The tube was vortexed and centrifuged, and aqueous layer taken. RNA was then precipitated by adding the following: 50 μl 3M NaOAc pH 5.2, 2 μl glycoblue (glycogen) (Ambion), 1 ml 1:1 EtOH:isopropanol, and incubating overnight at −20° C.

Size Separation of RNA

RNA pellets were centrifuged, washed, and dried, and recovery checked by counting in scintillation counter. RNA was resuspended in water and run on 20% denaturing polyacrylamide gel (1:19 acrylamide, 7M urea) along with pre-ligation RNA. Gel was visualized by autoradiography, and 2 fractions of 60-100 nucleotides (nt) and 100-200 nt were removed, placed into Eppendorf tubes with 350 μl of nucleic acid elution buffer, and crushed with a 1 ml syringe plunger. Tubes were incubated at 37° for 30 min at 1200 rpm, then gel slurry was added to a Costar SpinX column containing a 1 cm glass pre-filter. Columns were spun at 14,000 rpm in a microcentrifuge, and supernatant was removed. Nucleic acid elution buffer comprised 1M NaOAc pH 5.2 and 1 mM EDTA. The following was then added, then samples were precipitated overnight at −80° C.: 2 μl glycoblue and 1 ml 1:1 EtOH:isopropanol.

cDNA Synthesis and PCR

RNA was centrifuged, washed, dried, and counted in scintillation counter. RNA was resuspended in 9 μl H2O, and 2 μl of DP3 primer (SEQ ID No 494) at 5 pmol/μl was added. Tubes were heated at 65° for 5 min, chilled, and centrifuged for 10 seconds (sec). The following ingredients were then added: 2 μl 10 mM deoxynucleoside triphosphates (dNTP), 2 μl 0.1 M 1,4-dithio-threitol (DTT), 4 μl 5× SuperScript RT buffer, 0.50 RNAsin, 0.5 μl SuperScriptII (Invitrogen), and tubes were incubated at 55° C. for 30 min, then at 90° for 5 min, then chilled. PCR reaction was performed, using the following reaction components and parameters, (respectively): 4 μl 10×Pfu buffer, 4 μl DP3 primer and 4 μl DP5 primer (SEQ ID No 493), both at 5 pmol/μl, 4 μl 2.5 mM dNTPs, 4 μl radiolabeled DP5 primer at 0.5 pmol/μl, 1 μl Pfu, 3 μl of the RT reaction, and 16 μl H₂O; at 94° C., 30 seconds; 61° C., 30 seconds; 72° C., 30 seconds, for 35 cycles. The DNA primers were provided by Operon.

10 μl of each PCR reaction was run on a 10% denaturing polyacrylamide gel and visualized by autoradiography. Bands of 60-100 nt and 100-200 nt were excised and eluted with SpinX columns as described, and resuspended in 10 μl of water.

Re-PCR and Cloning

DNA was centrifuged, washed, dried, resuspended in 20 μl H₂O. Each reaction was split into 2 samples, and subjected to PCR using the ingredients 5 μl 10×Pfu buffer, 1 μl 10 mM dNTPs, 1 μl Pfu, 2 μl purified DNA, 37 μl H₂O, and the primer pairs DP5/DP3NheI (SEQ ID No 493 and 496) (reaction A), or DP5EcoRI/DP3NheI (SEQ ID No 495 and 496) (reaction B) with the parameters described hereinabove for 20 cycles. Each reaction was electrophoresed on an 8% denaturing polyacrylamide gel. The major band for each PCR reaction was excised using a UV box, and purified as described hereinabove. RT-PCR was then performed again with the same primer pairs, and the products were electrophoresed on a 4% metaphor agarose gel, and the DNA was purified.

Both reactions A and B were digested with the enzyme NheI for 1 h at 37°, then incubated for 20 min at 70°. Reaction B was additionally digested by adding 1 μl EcoRI and incubating for 1 h at 37°. The reaction was then incubated for 20 min at 70°. Digestion products were desalted using a G25 column.

Ligation

The following mixture was incubated 3 h at 16° C.: 7 μl 10× ligation buffer, 3 μl T4 DNA ligase, 30 μl of digested tags A, and 30 μl of digested tags B. The mixture was then incubated 20 minutes at 60° to inactivate the ligase, desalted using an S200 column, and dried in a speedvac to 20 μl final volume. The DNA was purified by electrophoresing 20 μl of ligation product on a 2% agarose gel. Ligation was confirmed by visualization of bands 1, 2, 4, 6, and 8 times the size of the initial PCR product. Visible bands of 400-800 nucleotides length were excised, purified from the gel, and the purified DNA resuspended in 30 μl of EB.

TOPO Cloning and Sequencing

A 3′ A end was generated using the following protocol: The following mixture was incubated at 72° for 20′: 3.5 μl DNA, 0.5 μl 10×Taq buffer, 0.5 μl 10 mM dATP, and 0.5 μl Taq polymerase (5U). The mixture was then placed on ice and used immediately in the TOPO cloning reaction. The following reagents were combined, mixed gently and incubated 5 min at room temperature: 2-4 μl DNA, H₂O to 4 μl, 1 μl salt solution, and 1 μl pCR4-TOPO vector. 2 μl of reaction was then added to a vial of Top10 competent cells, which were transformed by incubating 10 min on ice, then 30 sec at 42°, then 2 min on ice. 250 μl SOC medium was added, and the cells incubated for 1 hr, shaking, at 37°. 10-50 μl of the cell suspension was spread on ampicillin plates containing 40 μl each of IPTG and X-gal stock solutions. The X-gal stock consisted of 400 mg X-Gal in 10 ml dimethylformamide and was stored at −20° C.; the IPTG stock consisted of 238 IPTG/10 ml distilled and purified H2O, and was filter sterilized and stored at 4° C. Miniprep DNA was prepared from white colonies following overnight incubation.

Sequencing

Inserts were then sequenced, using M13F primer (SEQ ID No 509) (custom ordered from Operon). 16 μl of miniprep DNA and 2 μl primer (diluted to 5 μM; 10 pmolar final concentration) were mixed and submitted to the facility.

Linker and Primer Sequences

The following RNA linkers were used in the initial ligation step: RL5 (5′-OH AGG GAG GAC GAU GCG G 3′-OH) (SEQ ID No 477) and RL3 5′-P CGA GAU GGC GGC UUC CUG C 3′-puromycin (SEQ ID NO 492).

The following DNA primers were used in RT-PCR and PCR: DP5 DNA primer: AGG GAG GAC GAT GCG G (SEQ ID NO 493), DP3 DNA primer: GCA GGA AGC CGC CAT CTC G (SEQ ID NO 494), DP5EcoRI: GAA TTC AGG GAG GAC GAT GCG G (SEQ ID NO 495), DP3NheI: GCT AGC AGG AAG CCG CCA TCT CG (SEQ ID NO 496). The following DNA primer was used in sequencing: M13F: GTAAAACGACGGCCAG (SEQ ID No 509).

Primers were gel purified.
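
The relationship between the RNA linkers and the DNA primers listed above can be checked directly from the sequences. The following is a minimal sketch, not part of the described protocol; the helper names `rna_to_dna` and `reverse_complement` are illustrative. It tests whether DP5 has the same sequence as RL5 (in DNA form) and whether DP3 is the reverse complement of RL3, as would be expected for a reverse-transcription primer that anneals to the 3′ linker.

```python
# Linker and primer sequences copied from the listing above (spaces removed).
RL5_RNA = "AGGGAGGACGAUGCGG"
RL3_RNA = "CGAGAUGGCGGCUUCCUGC"
DP5_DNA = "AGGGAGGACGATGCGG"
DP3_DNA = "GCAGGAAGCCGCCATCTCG"

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def rna_to_dna(seq):
    """Return the DNA spelling of an RNA sequence (U replaced by T)."""
    return seq.upper().replace("U", "T")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(dna.upper()))


# DP5 should be identical to the 5' linker in DNA form (sense primer).
dp5_matches_rl5 = DP5_DNA == rna_to_dna(RL5_RNA)

# DP3 should be the reverse complement of the 3' linker (antisense primer).
dp3_anneals_to_rl3 = DP3_DNA == reverse_complement(rna_to_dna(RL3_RNA))

print(dp5_matches_rl5, dp3_anneals_to_rl3)
```

Both comparisons hold for the sequences as listed, which is consistent with DP3 priming reverse transcription from the 3′ linker and DP5 priming second-strand synthesis from the 5′ linker.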

Results

The protocol for purification of RNA molecules binding to Nova-1 or Nova-2 protein is depicted in FIG. 1A. Cross-linked cells from mouse brain tissue were collected, lysed, digested with DNAseI (not depicted) and limiting amounts of RNAse T1, and immunoprecipitated with anti-Nova antiserum. Immunoprecipitated RBP-RNA complexes were washed, and remaining RNA labeled with γ-³²P ATP. Immunoprecipitated material was resolved by SDS-PAGE, then transferred to NC. RBP-RNA complexes were extracted from the NC, and RNA was purified by digestion of the protein, then directionally ligated to two RNA oligonucleotides. Fragments were then subjected to RT-PCR, ligation into cloning vectors, and sequencing.

When CLIP was used to identify Nova-RNA complexes from mouse brain, RNA was co-purified with Nova only following UV-B irradiation; in the absence of cross-linking, or when pre-immune rabbit serum was used for immunoprecipitation, no RNA co-purified with Nova (FIGS. 1, B and C). As an additional control to assess the specificity of the interaction, CLIP was performed using WT versus Nova-2^(−/−) brain, with an antibody that recognizes both Nova-1 and Nova-2 proteins. In WT brain, cross-linked bands migrating with both Nova-1 and Nova-2 proteins were evident, but in Nova-2^(−/−) brain, RNA cross-linked specifically to protein corresponding to Nova-1, while the band migrating at the molecular weight of 70 kDa, corresponding to Nova-2 isoform, was lost (FIG. 1C). Thus the ability of CLIP to isolate RNA was dependent on cross-linking and on the presence of immunoprecipitated Nova protein. These controls demonstrated that only RNA molecules that were covalently bound to Nova were detected using the CLIP method with anti-Nova antiserum. The radioactive band on the gel was positioned 5-10 kDa higher than Nova-1 without cross-linked RNA, as assayed by Western blotting (data not shown).

Example 2 Monitoring of Linker Ligation and RT-PCR of RNA Molecules

³²P-labeled RNA was size purified after cross-linking and purification from N2A cells using anti-Nova antiserum as described in Example 1. The size of the RNA fragments ranged from 24-150 bases; the modal size of the RNA was approximately 60 bases (FIG. 2A). Purified RNA fragments were ligated to 5′ and 3′ linker oligonucleotides (“linkers”), which added 16 bases to each end of the molecule. The majority of the labeled fragment RNA shifted in size by 32 bases, indicating successful ligation (FIG. 2B). RNA isolated from regions 1 and 2 in FIG. 2B was amplified by RT-PCR with specific primers complementary to the linker sequences (FIG. 2C). The prominent band at 32 bases was the product from the ligation of the two RNA oligonucleotides without insert. The products in (C) were further divided and further amplified by PCR (FIG. 2D).

Example 3 CLIP Method Using Nova-1 Antisera Enabled the Isolation and Identification of Sequence Fragments Containing Nova-1 Binding Sites

Materials and Experimental Methods

Computer Analysis of Nova CLIP Fragments.

3400 control fragments were randomly generated by a computer program from a 200,000-nucleotide-long sequence consisting of 66% intronic, 14% exonic and 20% 3′UTR sequences (corresponding to the ratio in Nova CLIP fragments) from random genes on mouse chromosome 1, such that they corresponded in size to Nova CLIP fragments (with an average size of 71 nucleotides). Another program was written to count the occurrences of a particular polynucleotide (up to 20 nucleotides in a row) in each fragment, and to calculate the frequency of fragments carrying a certain number of copies of that polynucleotide (for example, YCAY, where Y represents either U or C). An additional program was written to calculate the average frequencies of nucleotides at three positions flanking a particular dinucleotide (CA in this case) in all fragments.
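
The counting programs described above are not reproduced in this document. The following is a minimal sketch of how such an analysis might be written, under stated assumptions: overlapping motif occurrences are counted, "three positions flanking" is taken to mean three positions on each side of the dinucleotide, and all function names are illustrative rather than those of the original programs.

```python
import re
from collections import Counter

# Degenerate symbols used in the motifs (Y = pyrimidine, R = purine, N = any base).
IUPAC = {"Y": "[CU]", "R": "[AG]", "N": "[ACGU]"}


def motif_to_regex(motif):
    """Compile a motif such as 'YCAY' into a regex that also finds overlapping sites."""
    body = "".join(IUPAC.get(ch, ch) for ch in motif.upper())
    return re.compile("(?=(" + body + "))")  # lookahead keeps overlapping matches


def count_motif(fragment, motif="YCAY"):
    """Count occurrences of a motif in one fragment (DNA input is read as RNA)."""
    rna = fragment.upper().replace("T", "U")
    return len(motif_to_regex(motif).findall(rna))


def count_distribution(fragments, motif="YCAY"):
    """Fraction of fragments carrying 0, 1, 2, ... copies of the motif."""
    counts = Counter(count_motif(f, motif) for f in fragments)
    total = len(fragments)
    return {n: c / total for n, c in sorted(counts.items())}


def flanking_frequencies(fragments, core="CA", width=3):
    """Nucleotide frequencies at offsets -width..-1 and +1..+width around each core."""
    offsets = list(range(-width, 0)) + list(range(1, width + 1))
    tallies = {off: Counter() for off in offsets}
    for frag in fragments:
        rna = frag.upper().replace("T", "U")
        start = rna.find(core)
        while start != -1:
            for off in offsets:
                # Negative offsets sit upstream of the core, positive ones downstream.
                pos = start + off if off < 0 else start + len(core) - 1 + off
                if 0 <= pos < len(rna):
                    tallies[off][rna[pos]] += 1
            start = rna.find(core, start + 1)
    return {
        off: {nt: n / sum(c.values()) for nt, n in c.items()}
        for off, c in tallies.items()
        if c
    }
```

Applying `count_distribution` to the CLIP fragments and to the control fragments would give the two distributions to be compared; `flanking_frequencies` gives the per-position base composition around each CA dinucleotide.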

Nova-2 Protein Purification.

6×His-Nova-2-T7 protein was expressed in E. coli and purified with successive Chelating Sepharose fast flow column (Amersham 17-0575-01) and T7-fragment antibody agarose (Novagen, 69026).

Transcription of Oligonucleotide Templates.

The PCR products for each of the four tested CLIP fragments and genomic controls were annealed to the oligonucleotide 5′-AGTAATACGACTCACTAFRAGMENT-3′ for transcription with T7 polymerase (Promega), and RNA synthesis carried out by using α-³²P-UTP in standard transcription buffer (Promega). Transcripts were size-purified by using 20% denaturing PAGE.

Measurement of RNA-Protein Binding.

Binding dissociation constants were measured by a nitrocellulose filter binding assay (Carey, J et al, Biochem 22, 2601). 50-μl reactions containing 50-100 femtomole (fmol) of RNA internally labeled with ³²P and concentrations of Nova-2 in 3-fold dilutions typically ranging from 0.2 nM to 493 nM were mixed in 1×BB (a buffer containing 50 mM TrisOAc pH 7.7, 200 mM KOAc, 1 mM MgOAc, 1 mM DTT, 0.2 mg/ml heparin) and were incubated for 10 min at 25° C., followed by filtering and washing. Dissociation constants were determined graphically by plotting the fraction of bound RNA versus the log of the protein concentration (Irvine, D. et al, J Mol Biol 222: 739).
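
The graphical determination described above can be approximated numerically. The following sketch assumes simple 1:1 binding with protein in large excess over RNA, so that the fraction bound is P/(P + Kd) and the dissociation constant equals the protein concentration at half-maximal binding. That model, the plateau value, and the function names are assumptions for illustration; the cited references should be consulted for the procedure actually used.

```python
import math


def dilution_series(start_nM, fold, n_points):
    """Protein concentrations for a serial dilution, lowest concentration first."""
    return [start_nM * fold ** i for i in range(n_points)]


def fraction_bound(protein_nM, kd_nM):
    """Fraction of RNA bound under a simple 1:1 model with protein in excess."""
    return protein_nM / (protein_nM + kd_nM)


def estimate_kd(protein_nM, fractions, plateau=1.0):
    """Estimate Kd as the concentration at half-maximal binding.

    Interpolates linearly in log10(concentration) between the two data
    points that bracket half of the plateau, mimicking reading the
    midpoint off a plot of fraction bound versus log concentration.
    """
    half = plateau / 2.0
    points = sorted(zip(protein_nM, fractions))
    for (p_lo, f_lo), (p_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= half <= f_hi and f_hi > f_lo:
            t = (half - f_lo) / (f_hi - f_lo)
            log_kd = math.log10(p_lo) + t * (math.log10(p_hi) - math.log10(p_lo))
            return 10 ** log_kd
    raise ValueError("half-maximal binding is not bracketed by the data")


# Simulated example: 3-fold dilutions starting at 0.2 nM, hypothetical Kd of 23 nM.
concentrations = dilution_series(0.2, 3, 8)
simulated = [fraction_bound(p, 23.0) for p in concentrations]
kd_estimate = estimate_kd(concentrations, simulated)
```

With real data the plateau is often below 1.0 because filter retention is incomplete, so the `plateau` argument would be set from the observed maximum rather than left at its default.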

Results

An unbiased screen was performed to identify Nova RNA targets in vivo. 340 Nova CLIP fragments were sequenced (FIG. 3A), which had an average length of 71 nucleotides. An annotated list of the CLIP fragments, which have been assigned SEQ ID No. 1-335, is shown in FIG. 3H. The largest set of CLIP fragments (121) were within long introns; 99 (82%) of these were within the first 3 introns. 68 were found within introns shorter than 10 kb, and 18 (26%) of these flanked alternative exons. 58 were within 3′ UTRs and 38 within exons, 2 of which were previously reported to be alternatively spliced. The sequences of the Nova CLIP fragments were compared with the known Nova binding element, multimers of the YCAY tetramer. Tetramer frequency analysis revealed that on average, each tag harbored 4.2 YCAY tetramers, compared to an expected frequency of 1.1 YCAY tetramers in random sequences, and the observed frequency of 1.1 and 1.7 in random genomic sequences and CLIP fragments obtained with an unrelated RBP (FIGS. 3B and 3G). Moreover, analysis of nucleotide frequencies flanking all CA dimers present in Nova CLIP fragments showed an overrepresentation of YCAU tetramers flanked by pyrimidines (FIG. 3C), which was also evident in the 5 most frequent hexamers, which were overrepresented 15-30 times; no such increase was evident in analysis of control tags (FIG. 3D). Filter binding assays were performed using purified Nova-2 protein and RNA transcribed from 4 different Nova CLIP fragments (FIG. 3E, F). All four RNAs bound Nova-2 with high affinity (FIG. 2E; 23 to ˜400 nanomolar [nM] affinity). Control tags of genomic sequence immediately 5′ to the CLIP fragments did not bind, nor did CLIP fragments in which CA dinucleotides were mutated to AA (not shown). Thus, Nova protein was selectively cross-linked to high affinity RNA targets harboring Nova binding sites.
The CLIP fragments identified reflected the (UCAUY)₃ and YCAYC motifs observed in RNA selection experiments performed with Nova, and were consistent with functional studies demonstrating that Nova regulates alternative splicing by binding clusters of at least three intronic UCAU tetramers in target RNAs. Thus, CLIP fragments have been verified using several stringent tests: sequence comparisons with known Nova binding sites and demonstration of direct RNA-protein interactions. Taken together, these data further validate CLIP as a method of specifically purifying RNA sequence fragments directly binding to a protein of interest. Additionally, they delineate, more fully than the prior art, the role of sequence surrounding YCAY tetramers in affecting Nova binding.
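
The expected frequency of about 1.1 YCAY tetramers per random fragment can be reproduced with a short calculation. The text does not state how that expectation was derived; the sketch below assumes independent positions with equal base frequencies (0.25 each) and a fragment length of 71 nucleotides, and its function names are illustrative.

```python
def motif_probability(motif, base_freq=None):
    """Probability that a random window matches a degenerate motif.

    Assumes independent positions; base_freq defaults to equal frequencies.
    """
    base_freq = base_freq or {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}
    classes = {"Y": "CU", "R": "AG", "N": "ACGU"}
    p = 1.0
    for ch in motif.upper():
        # A degenerate symbol matches any base in its class.
        p *= sum(base_freq[b] for b in classes.get(ch, ch))
    return p


def expected_count(fragment_length, motif):
    """Expected number of (possibly overlapping) motif occurrences in a fragment."""
    windows = max(fragment_length - len(motif) + 1, 0)
    return windows * motif_probability(motif)


p_ycay = motif_probability("YCAY")           # 0.5 * 0.25 * 0.25 * 0.5 = 1/64
expected_per_tag = expected_count(71, "YCAY")  # 68 windows * 1/64
enrichment = 4.2 / expected_per_tag          # observed mean over expected mean
```

Under these assumptions the expectation is about 1.06 per 71-nucleotide fragment, in line with the 1.1 stated above, and the observed mean of 4.2 corresponds to roughly a four-fold enrichment.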

Example 4 Nova Regulates Alternative Splicing and Protein Expression of Jnk2, Neogenin, and Gephyrin

Materials and Experimental Methods

RT-PCR analysis.

PCR of mouse JNK2 was performed at annealing temperature of 60° C. and 26 cycles, with primers F, 5′-TGATGACTCCCTATGTGGTAACTCG (SEQ ID No 497) and R, 5′-TCTCTGGCTTGACTTGTTTTTATTTTG (SEQ ID No 498), and PCR products were digested with AluI, which has sites in exons 6b and 7. PCR of mouse neogenin was performed at annealing temperature of 61° C. and 23 cycles, with primers F, 5′-ACACTGGCTGGAAGGAGGGG (SEQ ID No 499) and R, 5′-TGGGCTGTGGGAAGACTCTGG (SEQ ID No 500). PCR of mouse gephyrin was performed at annealing temperature of 61° C. and 23 cycles, with primers F, 5′-TGTGGAATAAGGGGGAAAACTCTG (SEQ ID No 501) and R, 5′-TCGTGGGAGCACCTGAACAC (SEQ ID No 502). Clontech first strand cDNAs were used for analysis of splicing in mouse tissues.

Results

Nova-2^(−/−) mouse brain was used to validate candidate RNAs by assessing utilization of alternatively spliced exons present near Nova CLIP fragments. Of 18 RNAs assayed, seven showed changes in alternative splicing in Nova-2-null mouse brain ranging from 1.6-fold to 60-fold (FIG. 4A). JNK2 is a cytoplasmic signaling protein that translocates to the nucleus to phosphorylate and activate several transcription factors including ATF2 and c-Jun. The alternatively spliced exons 6a and 6b encode isoforms that preferentially bind ATF2 or c-Jun, respectively, and are preferentially included in brain and non-neuronal tissues, respectively. A Nova CLIP fragment was identified in JNK2 pre-mRNA, near exon 6b (FIG. 4A). RT-PCR analysis revealed a net 6-fold change in exon utilization in Nova-2^(−/−) relative to Nova-2^(+/+) cortex: a 3-fold decrease in utilization of the exon 6a isoform, and a 2-fold increase in exon 6b in JNK2 RNA (FIG. 4A).

Neogenin, a homologue of DCC (Deleted in Colorectal Cancer), has been reported to bind netrin-1, although its role in axon guidance has not yet been fully elucidated. Neogenin is expressed in all adult mouse tissues, and has four alternative exons, one of which, exon 27, contains a Nova CLIP fragment. Splicing of all four alternative exons was assayed in RNA obtained from the cortex of Nova-2 null mice. Alternative splicing of exon 27 was drastically altered relative to wild-type brain, such that there was an approximately 36-fold increase in utilization of exon 27 in RNA from Nova-2^(−/−) relative to Nova-2^(+/+) cortex (FIG. 4B). In contrast, there was no change in utilization of the other three alternatively spliced neogenin exons in Nova-2^(−/−) relative to Nova-2^(+/+) cortex, consistent with previous observations that Nova regulates splicing in only a subset of regulated exons.

Two Nova CLIP fragments were identified in gephyrin, one in intron 7, near the alternatively spliced exon 9, and a second in intron 14 (FIGS. 4C, 3H). In wild-type mouse brain, gephyrin transcripts preferentially excluded exon 9 (96%). In Nova-2^(−/−) mouse brain, only 27% of gephyrin transcripts excluded exon 9, and there was a compensatory increase in exon 9 inclusion (73% vs. 4% in wild-type cortex; FIG. 4C). Each of the seven gephyrin exons reported to be alternatively spliced was examined, and it was found that only exon 9 was regulated by Nova (FIG. 4C). Thus, the presence of Nova in neurons correlated with alternative exon skipping in gephyrin and neogenin transcripts.
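The inclusion percentages above can be related to a fold change in several ways. As a hedged illustration only, the Python sketch below converts each percentage into an included:excluded ratio and compares knockout with wild type. The function name is hypothetical, and this is one plausible convention rather than necessarily the calculation behind the fold changes reported for FIG. 4C; with the rounded percentages it yields a value in the mid-60s.

```python
def inclusion_ratio(percent_included):
    """Ratio of exon-included to exon-excluded transcripts."""
    return percent_included / (100.0 - percent_included)


# Percentages from the text: 4% inclusion in wild type, 73% in Nova-2 null.
wild_type = inclusion_ratio(4.0)    # 4 / 96
knockout = inclusion_ratio(73.0)    # 73 / 27
fold = knockout / wild_type

print(f"included:excluded ratio changes {fold:.1f}-fold")
```

A simple ratio of the two inclusion percentages (73/4) would give a smaller number, so the convention used should always be stated alongside any fold change.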

Changes in gephyrin protein isoform expression in Nova-2^(−/−) cortex were also detected by Western blot (FIG. 4C), and these were consistent in magnitude with the changes seen in gephyrin transcripts. Gephyrin exon 9 utilization in different tissues was surveyed by RT-PCR analysis, which revealed that gephyrin transcripts in non-neuronal tissues include 22-fold (testis) to 115-fold (heart) more exon 9 than brain (FIG. 4D), which is similar in scale to the 60-fold change seen in Nova-2^(−/−) versus wild-type cortex (FIG. 4C).

Gephyrin RNA is a particularly interesting target for Nova action. Gephyrin is essential for the correct localization of GABAA γ2 and GlyRα2 subunits to the inhibitory synapse, and Nova regulates alternative splicing of transcripts encoding both of those receptors. Like Nova, gephyrin has been reported to be the target of a cancer-associated neurologic disorder manifest by excess motor activity. Finally, Nova-dependent regulation of a network of RNAs encoding proteins that mediate neuronal inhibition correlates with the defective motor inhibition in Nova^(−/−) mice and in POMA patients.

These observations indicate that Nova acts as a critical factor determining specificity of gephyrin alternative splicing in neurons, an observation not previously made in vivo with vertebrate splicing factors.

To summarize, the data in FIG. 4 demonstrate that Nova regulates alternative splicing (6-60 fold effects) of several RNAs identified by the CLIP method. Our results indicate that Nova may be the primary, if not sole, determinant of brain-specific alternative splicing of these, and perhaps other, transcripts. These findings provide further evidence that the locations of Nova CLIP fragments are able to predict functional binding sites. In addition, these findings demonstrate the ability of CLIP to identify previously unknown RNA targets that are regulated by an RBP in vivo. Finally, the data in FIG. 4A demonstrate a unique role specifically for Nova regulation of JNK2 expression.

Example 5 Nova CLIP Fragments Fall within Several Known Genes Involved in Synaptic Function, Signaling, and Protein Synthesis

In addition to gephyrin RNA, a number of RNAs were identified multiple times within the set of 340 Nova CLIP tags, suggesting that these might be a particularly robust subset of RNA targets. 77 CLIP tags (23%) mapped to only 34 transcripts, each of which contained 2 or more tags. 21 of these 34 transcripts correspond to characterized genes (FIG. 5A); 15 (71%) of these encode proteins that function in the synapse, which indicates that Nova may coordinately regulate a biologically coherent set of RNAs.
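The grouping described above, in which tags are tallied per transcript and transcripts with two or more tags are retained, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the tag and transcript identifiers are hypothetical toy values, not data from the Examples, and the function names are invented for the sketch. The percentage check at the end uses only the counts quoted in the text.

```python
from collections import Counter


def multi_tag_transcripts(tag_to_transcript, min_tags=2):
    """Return {transcript: tag count} for transcripts with >= min_tags tags."""
    counts = Counter(tag_to_transcript.values())
    return {name: n for name, n in counts.items() if n >= min_tags}


def percent(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Hypothetical toy mapping of CLIP tag IDs to transcript names.
toy_tags = {
    "tag1": "geneA", "tag2": "geneA",
    "tag3": "geneB",
    "tag4": "geneC", "tag5": "geneC", "tag6": "geneC",
}
print(multi_tag_transcripts(toy_tags))

# Counts quoted in the text: 77 of 340 tags; 15 of 21 characterized genes.
print(percent(77, 340), percent(15, 21))
```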

An intriguing subset of Nova target RNAs involved in synaptic biology are those involved in neuronal inhibition (FIG. 5B). These include the microtubule-associated protein MAP1b, which anchors GABAc receptors to the cytoskeleton and modulates their sensitivity; GABA_(b) 2 receptor and GIRK2, which mediate slow inhibitory postsynaptic potentials; the K+ voltage-gated channel KCNQ3, which mediates inhibition of repetitive action potentials; the nicotinic acetylcholine receptors β2 and α2, which contain CLIP tags at homologous positions in exon 5 and are together as α4β2 heteropentamers highly expressed on GABAergic interneurons, thus influencing inhibitory activity; and the JNK proteins (FIG. 4A), which are essential for neuronal microtubule integrity by controlling phosphorylation of MAP1b and MAP2, and for the regulation of GABA action in C. elegans inhibitory motor neurons.

Example 6 Somatodendritic Nova Simultaneous Detection with Gephyrin in the Postsynaptic Cytoplasm

Materials and Experimental Methods

Nuclear/Cytoplasmic Fractionation Method.

Brain tissue was Dounce homogenized in cold 10 mM 2-[4-(2-hydroxyethyl)-1-piperazinyl]ethanesulfonic acid (HEPES) (pH 7.9), 10 mM NaCl, 1.5 mM MgCl₂, 0.2% Triton, 10 mM NaF, protease inhibitors (Roche), and spun at 3000 times gravitational force (×G) for 3 min. Supernatant was collected as cytoplasmic fraction, and pellet was resuspended in 20 mM HEPES (pH 7.9), 25% glycerol, 1.5 mM MgCl₂, 1.4 M KCl, 0.2 mM EDTA, 0.5% NP-40, 10 mM NaF, protease inhibitors (Roche), 5% DNAse, incubated 5 min at 37° C., dialyzed against 1×PBS, pH 7.4, 1.5 mM MgCl₂, 0.5% NP-40, mM NaF, and collected as nuclear fraction. Both fractions were ultracentrifuged at 100000×G for 30 min.

Tissue preparation.

Adult Sprague-Dawley rats (Janvier, France) were deeply anaesthetized with pentobarbital (60 mg/kg body weight, i.p.), and intracardially perfused. For fluorescent immunocytochemistry and in situ hybridization (ISH), animals were perfused with 4% paraformaldehyde (PFA) in phosphate-buffered saline (PBS) (0.1M, pH 7.2). For electron microscopy (EM), immunocytochemistry, and ISH, animals were perfused with 4% PFA and 0.1% glutaraldehyde in PBS. Spinal cords were removed and postfixed in 4% PFA in PBS overnight at 4° C. Spinal cord sections were cut on a vibratome and collected in PBS.

Fluorescent Immunocytochemistry on Spinal Cord Sections.

Spinal cord 30 μm sections were rinsed in 50 mM NH₄Cl in PBS for 15 minutes (min) and permeabilized with 0.1% Triton X-100, 0.1% bovine gelatin in PBS for 10 min. Free-floating sections were incubated with primary antibodies (in 0.1% Triton X-100, 0.1% bovine gelatin in PBS, overnight, at 4° C.). The following day, sections were rinsed three times in PBS (10 min each) and revealed by the corresponding secondary antibodies in PBS, 2 hours at room temperature (RT). After three washes in PBS (10 min each), sections were mounted on slides with Vectashield (Vector Lab.).

Fluorescent Non-Radioactive In Situ Hybridization and Immunocytochemistry on Spinal Cord Sections.

Fluorescent in situ hybridization (FISH) was as previously described. Digoxigenin labeled probes and Nova proteins were labeled at the same time (in 100 mM Tris-HCl pH 7.5, 150 mM NaCl, 2% Bovine Serum Albumin (BSA), 0.3% Triton X-100, overnight, 4° C.). The anti-digoxigenin and anti-Nova-1 primary antibodies were detected by incubating sections with the appropriate secondary antibodies (in PBS, 2 hr at RT). Each incubation was followed by three washes in PBS (10 min each). Finally, sections were mounted on slides with Vectashield (Vector Lab.).

Image Acquisition.

Sections processed for fluorescent immunocytochemistry and ISH were observed with an epifluorescent Zeiss microscope, or a Leica confocal laser scanning microscope. For confocal images, background noise was reduced by applying a Gaussian filter to the optical sections.

Electron microscopic immunocytochemistry.

100 μm thick vibratome sections were cryoprotected in 20% glycerol-20% sucrose in PBS, and permeabilized by freezing and thawing. Sections were collected in PBS, rinsed in 50 mM NH₄Cl in PBS for 15 min, and 0.1% bovine gelatin in PBS for 10 min. The free-floating sections were incubated with the primary antibodies (in 0.1% bovine gelatin in PBS, overnight, at 4° C.). The following day, sections were rinsed three times in PBS (10 min each) and incubated with the secondary antibodies (in PBS-1% BSA for biotinylated antibodies, 2 hr at RT; or PBS-0.2% fish gelatin for gold antibodies, overnight at 4° C.). Biotinylated antibodies were revealed with the ABC Elite kit (Vector Lab) in PBS, for 1 hour at RT, and the peroxidase reaction was carried out in the presence of DAB and hydrogen peroxide (Sigma Fast, Sigma). Nanogold-coupled antibodies were amplified as described.

Electron microscopic pre-embedding non-radioactive in situ hybridization and immunocytochemistry.

50 μm sections were cryoprotected and permeabilized as for EM immunocytochemistry. Prehybridization and hybridization were as described above for fluorescent ISH. After the stringency washes, sections were rinsed in PBS and incubated in the primary antibody (1% BSA in PBS, overnight, at 4° C.). After three PBS rinses (10 min each), digoxigenin molecules and the primary antibody were detected by gold- and HRP-coupled antibodies, respectively (in 0.8% BSA, 0.2% fish gelatin in PBS, overnight, 4° C.). After three PBS rinses (10 min each), sections were incubated in 4% PFA in PBS (10 min), rinsed three times in PBS (10 min each) and several times in cold distilled water. Gold-coupled sheep anti-digoxigenin secondary antibodies were detected by a silver enhancement-gold toning protocol as described before. After three PBS rinses (10 min each), biotinylated antibodies were detected by peroxidase-DAB reaction as in classical immunocytochemical methods.

The sections processed for immunocytochemistry and ISH were dehydrated, osmicated and flat embedded in araldite (Fluka) resin. Ultrathin sections were prepared, mounted on copper grids and contrasted with uranyl acetate and lead citrate before examination under a Jeol CX H transmission electron microscope at 80 kilovolts (kV).

Results

Seven of the 34 transcripts with multiple Nova CLIP fragments harbor one or more fragments within the 3′ UTR (FIG. 3A), and two of these seven, MAP1B and KCNQ3, encode proteins that function in neuronal inhibition (FIG. 5B), indicating that part of Nova's role in regulating alternate splicing may be manifest in the dendrite or the inhibitory synapse. Fractionation of mouse brain into nuclear and cytoplasmic fractions revealed that two thirds of total Nova protein is present outside of the nucleus, although when normalized to protein mass, the highest concentration of Nova protein is in the nucleus (FIG. 6A). Immunofluorescence of rat spinal cord sections confirmed an abundance of Nova immunoreactivity outside of the nucleus in mouse motor neurons, including punctate reactivity in neuronal processes (FIG. 6C; see also FIG. 7A). To assess whether this reactivity corresponds to the localization of inhibitory synapses, we examined whether Nova and gephyrin reactivity co-localize, with gephyrin serving as a marker for localization to inhibitory synapses. Both immunofluorescence and electron microscopic evaluation demonstrate that Nova protein is present in the inhibitory synapse, in the vicinity of gephyrin protein (FIG. 6B-D). These data suggest a model in which Nova may regulate mRNAs within the dendrite by affecting their localization, translation, or half-life.

Example 7 Colocalization of Nova Protein and GlyRα2 mRNA in Motor Neurons

To examine the role of Nova in the regulation of RNA within the dendrite, co-localization of Nova protein and GlyRα2 mRNA in mouse spinal cord motor neurons was assessed by fluorescent and electron microscopy, which revealed that Nova protein co-localizes with GlyRα2 mRNA in the dendrite (FIG. 7A-F). Interestingly, both Nova protein and GlyRα2 mRNA were associated with membranous bodies suggestive of endoplasmic reticulum (FIG. 7C-F). Taken together, the data from Examples 6-7 show that at least a subset of Nova localizes to the inhibitory synapse. This suggests that Nova may coordinately regulate RNA information processed in the nucleus with RNA expression locally in the synapse. Since a hallmark feature of Nova antisera from POMA patients is that they abrogate RNA-protein interactions in vitro, these results suggest that disruption of such interactions, perhaps within the dendrite, may contribute to disease pathogenesis. Furthermore, the finding that Nova, gephyrin, and at least one regulated spliced mRNA, GlyRα2, localize to the inhibitory synapse further validates the CLIP method as a means of identifying RNA sequences that interact with a protein of interest.

Example 8 CLIP Method Using Antisera to Neuronal Hu Proteins Enabled the Isolation and Identification of Sequence Fragments Binding to Neuronal Hu Proteins

Materials and Experimental Methods

All experimental procedures were conducted as in Example 3, with the exception that antiserum against neuronal Hu proteins was utilized in place of anti-Nova serum. Sequences were aligned with the aid of the clustalw software package.

Results

115 RNA fragments binding to Neuronal Hu proteins were sequenced and assigned SEQ ID No. 336-449 (FIG. 8A). As was the case with CLIP using anti-Nova-1 antisera, some of the RNA fragments corresponded to previously known proteins. Some of the fragments were located in introns of the genes (35%), others were in the 3′ and 5′ untranslated regions (45%), and in others (20%), the location could not be determined because the gene had not been identified. Obtained sequences were aligned using the clustalw website (FIG. 8B). A consensus sequence of the fragments was then determined, as depicted at the top.
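To illustrate how a consensus can be read off a set of aligned fragments, the Python sketch below takes sequences that have already been aligned (for example, by a multiple-alignment program) and reports the majority residue in each column. This is a simple column-majority rule chosen for illustration; it is not the algorithm used by clustalw, and the function name, the 50% threshold, and the toy sequences are all assumptions rather than data from FIG. 8B.

```python
from collections import Counter


def majority_consensus(aligned, min_fraction=0.5, gap="-"):
    """Column-wise majority consensus of equal-length aligned sequences.

    A column yields its most common non-gap residue when that residue
    occupies at least min_fraction of the rows; otherwise "N" is emitted.
    Columns consisting only of gaps yield a gap.
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    consensus = []
    for column in zip(*aligned):
        residues = [c for c in column if c != gap]
        if not residues:
            consensus.append(gap)
            continue
        residue, count = Counter(residues).most_common(1)[0]
        consensus.append(residue if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["AUUUA-", "AUUUAG", "CUUUAG"]
print(majority_consensus(toy_alignment))
```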

Example 9 Generation and Characterization of Nova-2 Knockout Mice

Materials and Experimental Methods

Generation of Nova-2^(−/−) Mice.

A Nova-2 lambda clone was isolated from an SV-129 mouse genomic library. To prepare a targeting construct, a left arm was generated by cloning an XbaI and HindIII genomic fragment. This fragment was cut with XhoI in the center and digested with ExoIII to remove the endogenous methionine. This construct removed 45 nt of sequence upstream of the initiator methionine. This fragment was then digested with XbaI to remove remaining 3′ DNA, ends were blunt ended with Klenow fragment, and the fragment was ligated to create the short arm (pΔ1.1). pΔ1.1 was cut with BamHI and SacI (in the multiple cloning site) and ligated to SacI/BamHI adapters harboring a PacI site. The right arm was generated from a 6.0 kb SacI fragment, in which an internal PacI site was ablated by Klenow fragment treatment and re-ligation. This SacI (ΔPacI) fragment was inserted into the pΔ1.1 SacI adaptor. The PacI site was then used to insert an IRES-Cre-Flip-Neo-Flip cassette. This targeting vector was electroporated into ES cells, G418 resistant clones were selected, and clones were screened by Southern blot using a HindIII-BamHI 950 bp genomic fragment located immediately upstream of the left arm. Chimeric animals were bred to C57Bl/6 mice, and agouti offspring were genotyped and then bred to a transgenic mouse expressing Flip-recombinase to remove the Neo cassette. Heterozygous lines were outbred to C57Bl/6, CD1 and FVB strains.

Results

The function of Nova-2 protein was explored by generating Nova-2 null mice. While the neurologic disorder POMA is characterized by dysfunction of inhibition of motor systems, in up to 58% of patients progressive multifocal neurologic deficits develop, including encephalopathy and dementia with cerebral atrophy. Since POMA antisera are reactive against all CNS neurons, and Nova-2 is largely or exclusively expressed in neocortex and hippocampal neurons, it seems possible that immune targeting against Nova-2 may lead to disease in some POMA patients.

To generate Nova-2^(−/−) mice, a targeting vector was constructed, consisting of genomic SV129 DNA fragments of 1.1 kb and 6 kb flanking a ~1.5 kb DNA fragment harboring the first known transcribed Nova-2 exon. An IRES-Cre FLIP-Neo-FLIP cassette was inserted into these arms such that it would be inserted into the first Nova-2 exon upstream of the ATG encoding the putative initiator methionine (FIG. 9A). Following electroporation into ES cells and selection with G418, clones harboring homologous recombinants were screened by Southern blot (FIG. 9B) and injected into blastocysts. Following breeding of chimeric mice into the germ line, mice were bred with CMV-Flp recombinase transgenic mice to remove the neomycin cassette, generating mice heterozygous for a null Nova-2 allele.

To confirm that these mice did not express Nova-2 protein, Western blots using extracts from several brain areas of Nova-2^(−/−), Nova-2^(+/−), or WT littermates were probed with POMA antisera. In each case, Nova-2^(+/−) or Nova-2^(−/−) brain showed reduced or absent expression, respectively, of Nova-2 protein isoforms. Interestingly, these included both the single 55 kD Nova-2 protein species previously described, and a series of previously described but poorly understood protein isoforms of higher apparent molecular weight (approximately MD) recognized by POMA antisera (FIG. 9C). These observations confirm that the Nova-2 targeting allele eliminated expression of any intact Nova-2 protein isoforms recognized by POMA antisera, and further implicate the higher MW POMA reactive proteins as products of the Nova-2 locus.

Nova-2^(−/−) mice were phenotypically indistinguishable from their WT littermates at birth. Within the first postnatal week, however, these mice typically grew less than their WT littermates (FIG. 9D), a finding also seen in Nova-1^(−/−) mice. By 1-2 weeks postnatal, the mice were less active, and by 2-3 weeks half of the Nova-2^(−/−) mice died (FIG. 9E). The knockout mice also had abnormally high circulating levels of IGF-1 and abnormally low levels of serum glucose (data not shown).

Example 10 Alternate Splicing Defects in Nova-2 Knockout Mice

Nova-1^(−/−) mice show a specific defect in inclusion of the GABAA γ2L and GlyRα2 E3A exons in the spinal cord and hindbrain. To evaluate whether Nova-2 may mediate similar functions in mouse brain, alternative splicing of these exons in Nova-2^(−/−) neocortex was analyzed. Inclusion of the γ2L exon of GABAA was significantly reduced in neocortex of the Nova-2^(−/−) mice, and this effect on the GABAA pre-mRNA was dose-dependent, as seen in Nova-1^(−/−) mice (FIG. 10A). This effect was specific, as shown by the finding that no changes in splicing of alternatively spliced neuronal exons in the proteins src, ced-3 homologue (ICH-1), or Nova-1 were evident in Nova-2^(−/−) neocortex (FIG. 10B). In addition, while all regions of the brain showed differences in alternative splicing, the differences were most prominent in the neocortex, an area known to express high levels of Nova-2 protein and little or no Nova-1 protein (data not shown).

Since a defect in GlyRα2 E3A exon inclusion is present in Nova-1^(−/−) mouse hindbrain, but not neocortex, splicing of this transcript was assayed in Nova-2^(−/−) mice. A significant deficiency in inclusion of exon E3A in neocortex of the Nova-2^(−/−) mice was observed, of a similar magnitude to that seen in Nova-1^(−/−) hindbrain (FIG. 10B). This effect was not evident in spinal cord, and there was not a significant effect in cerebellum; this relative specificity for neocortex was thus the reciprocal of the splicing defect seen in Nova-1^(−/−) mice, which was specific to the spinal cord, and absent in cortex (data not shown). Taken together, these results are consistent with the pattern of Nova-1 and Nova-2 expression, and suggest that either protein is able to regulate alternative splicing in a similar manner, but in a different range of neuronal cell types. 

1-174. (canceled)
 175. A method for identifying an RNA molecule interacting with an RNA binding protein of interest in a biological sample, said method comprising the steps of: a. irradiating said biological sample to create a covalent bond between said RNA molecule and said RNA binding protein of interest, thereby generating a covalently bound RNA binding protein-RNA complex containing said RNA molecule, wherein said biological sample is selected from the group consisting of whole tissue biopsy, tissue sample biopsy and whole organ; b. cleaving said RNA molecule by contacting said RNA binding protein-RNA complex with an agent capable of cleaving a bond thereof, thereby generating a fragment of said RNA molecule that is bound to the RNA binding protein, wherein said fragment is a transfer RNA, a small nuclear RNA, a ribosomal RNA, a messenger RNA, an anti-sense RNA, a small inhibitory RNA, a micro RNA or a ribozyme; c. selecting said RNA binding protein-RNA complex with a molecule that specifically interacts with a component of said RNA binding protein-RNA complex; d. purifying said RNA binding protein-RNA complex under stringent conditions comprising washing the complex with buffer at least 5 times; boiling said RNA binding protein-RNA complex in a denaturing ionic detergent; separating said RNA binding protein-RNA complex by SDS-PAGE; transferring said RNA binding protein-RNA complex to a substrate that preferentially binds RNA covalently bound to protein over RNA not covalently bound to protein; and digesting said RNA binding protein with a protease to liberate said fragment of said RNA molecule from said RNA binding protein-RNA complex; and e. ligating nucleotide linkers to said fragment of said RNA molecule and amplifying said fragment of said RNA molecule, thereby identifying an RNA molecule interacting with an RNA binding protein of interest.
 176. The method of claim 1, wherein step (b) comprises utilization of a nuclease.
 177. The method of claim 1, further comprising labeling said RNA molecule, wherein said method improves signaling from said label.
 178. The method of claim 1, wherein said nucleotide linkers are directionally oriented. 